GLYCOSIDASE ENZYMES
BACKGROUND OF THE INVENTION 

1. Field of the Invention

This invention relates to newly identified polynucleotides, polypeptides encoded by 
such polynucleotides, the use of such polynucleotides and polypeptides, as well as the 
production and isolation of such polynucleotides and polypeptides. More particularly, the 
polynucleotides and polypeptides of the present invention have been putatively identified as
glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases.

2. Description of Related Art 

The glycosidic bond of β-galactosides can be cleaved by different classes of
enzymes: (i) phospho-β-galactosidases (EC 3.2.1.85), which are specific for a phosphorylated
substrate generated via phosphoenolpyruvate phosphotransferase system (PTS)-dependent
uptake; (ii) typical β-galactosidases (EC 3.2.1.23), represented by the Escherichia coli LacZ
enzyme, which are relatively specific for β-galactosides; and (iii) β-glucosidases (EC
3.2.1.21) such as the enzymes of Agrobacterium faecalis, Clostridium thermocellum,
Pyrococcus furiosus or Sulfolobus solfataricus (Day, A.G. and Withers, S.G. (1986)
Purification and characterization of a β-glucosidase from Alcaligenes faecalis. Can. J.
Biochem. Cell. Biol. 64, 914-922; Kengen, S.W.M., et al. (1993) Eur. J. Biochem. 213,
305-312; Ait, N., Creuzet, N. and Cattaneo, J. (1982) Properties of β-glucosidase purified
from Clostridium thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W. (1991)
Evidence that β-galactosidase of Sulfolobus solfataricus is only one of several activities of
a thermostable β-D-glycosidase. Appl. Environ. Microbiol. 57, 1644-1649). Members of
the latter group, although highly specific with respect to the β-anomeric configuration of
the glycosidic linkage, often display a rather relaxed substrate specificity and hydrolyze
β-glucosides as well as β-fucosides and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze the hydrolysis of galactose
groups on a polysaccharide backbone or the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the hydrolysis of mannose
groups internally on a polysaccharide backbone or the cleavage of di- or
oligosaccharides comprising mannose groups. β-Mannosidases hydrolyze non-reducing,
terminal mannose residues on a mannose-containing polysaccharide and catalyze the
cleavage of di- or oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide composed of a β-1,4 linked
mannose backbone with α-1,6 linked galactose side chains. The enzymes required for the
degradation of guar are β-mannanase, β-mannosidase and α-galactosidase. β-Mannanase
hydrolyses the mannose backbone internally, β-mannosidase hydrolyses non-reducing,
terminal mannose residues, and α-galactosidase hydrolyses α-linked galactose groups.

Galactomannan polysaccharides and the enzymes that degrade them have a variety 
of applications. Guar is commonly used as a thickening agent in food and is utilized in 
hydraulic fracturing in oil and gas recovery. Consequently, galactomannanases are 
industrially relevant for the degradation and modification of guar. Furthermore, a need 
exists for thermostable galactomannanases that are active in extreme conditions associated
with drilling and well stimulation. 

There are other applications for these enzymes in various industries, such as in the 
beet sugar industry. 20-30% of the domestic U.S. sucrose consumption is sucrose from 
sugar beets. Raw beet sugar can contain a small amount of raffinose when the sugar beets 
are stored before processing and rotting begins to set in. Raffinose inhibits the 
crystallization of sucrose and also constitutes a hidden quantity of sucrose. Thus, there is 
merit to eliminating raffinose from raw beet sugar. α-Galactosidase has also been used as
a digestive aid to break down raffinose, stachyose, and verbascose in such foods as beans 
and other gassy foods. 

β-Galactosidases which are active and stable at high temperatures appear to be
superior enzymes for the production of lactose-free dietary milk products (Chaplin, M.F.
and Bucke, C. (1990) In: Enzyme Technology, pp. 159-160, Cambridge University Press,
Cambridge, UK). Also, several studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides via transglycosylation
reactions (Nilsson, K.G.I. (1988) Enzymatic synthesis of oligosaccharides. Trends
Biotechnol. 6, 156-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide synthesis by
enzymatic transglycosylation. Glycoconjugate J. 7, 145-162). Despite the commercial
potential, only a few β-galactosidases of thermophiles have been characterized so far. Two
genes reported encode β-galactoside-cleaving enzymes of the hyperthermophilic bacterium
Thermotoga maritima, one of the most thermophilic organotrophic eubacteria described to
date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M., Woese, C.R., Sleytr, U.B. and
Stetter, K.O. (1986) T. maritima sp. nov. represents a new genus of unique extremely
thermophilic eubacteria growing up to 90°C, Arch. Microbiol. 144, 324-333). The gene
products have been identified as a β-galactosidase and a β-glucosidase.

Pullulanase is well known as a debranching enzyme of pullulan and starch. The 
enzyme hydrolyzes α-1,6-glucosidic linkages on these polymers. Starch degradation for
the production of sweeteners (glucose or maltose) is a very important industrial application
of this enzyme. The degradation of starch proceeds in two stages. The first stage
involves the liquefaction of the substrate with α-amylase, and the second stage, or
saccharification stage, is performed by β-amylase with pullulanase added as a debranching
enzyme, to obtain better yields.

Endoglucanases can be used in a variety of industrial applications. For instance, the 
endoglucanases of the present invention can hydrolyze the internal β-1,4-glycosidic bonds
in cellulose, which may be used for the conversion of plant biomass into fuels and 
chemicals. Endoglucanases also have applications in detergent formulations, the textile 
industry, in animal feed, in waste treatment, and in the fruit juice and brewing industry for 
the clarification and extraction of juices.

Brief Description of the Drawings

The following drawings are illustrative of embodiments of the invention and are not 
meant to limit the scope of the invention as encompassed by the claims. 

Figures 1a-b are the full-length DNA and corresponding deduced amino acid
sequence of M11TL of the present invention. Sequencing was performed using a 378 
automated DNA sequencer for all sequences of the present invention (Applied Biosystems, 
Inc.). 

Figure 2 is an illustration of the full-length DNA and corresponding deduced amino 
acid sequence of OC1/4V-33B/G. 

Figure 3 is an illustration of the full-length DNA and corresponding deduced amino 
acid sequence of F1-12G. 

Figures 4a-b are the full-length DNA and corresponding deduced amino acid 
sequence of 9N2-31B/G. 

Figures 5a-b are the full-length DNA and corresponding deduced amino acid 
sequence of MSB8-6G. 

Figure 6 is the full-length DNA and corresponding deduced amino acid sequence 
of AEDII12RA-18B/G.

Figures 7a-b are the full-length DNA and corresponding deduced amino acid 
sequence of GC74-22G. 

Figures 8a-b are the full-length DNA and corresponding deduced amino acid 
sequence of VC1-7G1. 

Figures 9a-c are the full-length DNA and corresponding deduced amino acid 
sequence of 37GP1. 

Figures 10a-c are the full-length DNA and corresponding deduced amino acid
sequence of 6GC2. 

Figures 11a-d are the full-length DNA and corresponding deduced amino acid
sequence of 6GP2. 

Figures 12a-c are the full-length DNA and corresponding deduced amino acid 
sequence of 63GB1.

Figures 13a-b are the full-length DNA and corresponding deduced amino acid
sequence of OC1/4V. 

Figures 14a-e are the full-length DNA and corresponding deduced amino acid 
sequence of 6GP3. 

Figures 15a-d are the full-length DNA and corresponding deduced amino acid
sequence of Thermotoga maritima MSB8-6GP2.

Figures 16a-c are the full-length DNA and corresponding deduced amino acid 
sequence of Thermotoga maritima MSB8-6GB4. 

Figures 17a-d are the full-length DNA and corresponding deduced amino acid 
sequence of Bankia gouldi 37GP4.

Figures 18a-b are the full-length DNA and corresponding deduced amino acid
sequence of Pyrococcus furiosus VC1-7EG1.

SUMMARY OF THE INVENTION 

In a preferred embodiment of the present invention, there are provided isolated 
nucleic acids (polynucleotides) which encode mature enzymes having the deduced amino 
acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). 

In another embodiment, the invention provides a method for producing a 
polypeptide including culturing host cells containing the polynucleotide of Figures 1-18 and 
expressing from the host cell a polypeptide encoded by the polynucleotide and isolating the 
polypeptide. 

In another embodiment, the invention provides an enzyme selected from the group 
consisting of an enzyme having an amino acid sequence set forth in SEQ ID NOS: 15-28 
or 61-64 and an enzyme which has at least 30 consecutive amino acid residues of an enzyme
having an amino acid sequence set forth in SEQ ID NOS: 15-28 or 61-64.

In yet another embodiment, the invention provides a method for generating glucose 
from soluble cellooligosaccharides which includes contacting a sample containing
oligosaccharides with an effective amount of an enzyme selected from the group of
enzymes having the amino acid sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64
such that glucose is produced.

The publications discussed herein are provided solely for their disclosure prior to 
the filing date of the present application. Nothing herein is to be construed as an 
admission that the invention is not entitled to antedate such disclosure by virtue of prior 
invention. 

Definitions 

"Monosaccharide", as used herein, refers to a single polyhydroxy aldehyde or 
ketone unit.

"Oligosaccharide", as used herein, refers to short chains of monosaccharide units
joined together by covalent bonds. Of these, the most abundant are the disaccharides,
which have two monosaccharide units.

"Polysaccharide", as used herein, refers to long chains having many
monosaccharide units.

The term "gene" means the segment of DNA involved in producing a polypeptide 
chain; it includes regions preceding and following the coding region (leader and trailer) as 
well as intervening sequences (introns) between individual coding segments (exons). 

A coding sequence is "operably linked to" another coding sequence when RNA 
polymerase will transcribe the two coding sequences into a single mRNA, which is then 
translated into a single polypeptide having amino acids derived from both coding 
sequences. The coding sequences need not be contiguous to one another so long as the 
expressed sequences ultimately process to produce the desired protein. 

"Recombinant" enzymes refer to enzymes produced by recombinant DNA 
techniques; i.e., produced from cells transformed by an exogenous DNA construct encoding
the desired enzyme. "Synthetic" enzymes are those prepared by chemical synthesis. 

A DNA "coding sequence of" or a "nucleotide sequence encoding" a particular
enzyme, is a DNA sequence which is transcribed and translated into an enzyme when 
placed under the control of appropriate regulatory sequences.

Detailed Description of the Invention

The polynucleotides and polypeptides of the present invention have been identified 
as glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases as a result of their enzymatic activity.

In accordance with one aspect of the present invention, there are provided novel 
enzymes, as well as active fragments, analogs and derivatives thereof. 

In accordance with another aspect of the present invention, there are provided 
isolated nucleic acid molecules encoding the enzymes of the present invention including 
mRNAs, cDNAs, genomic DNAs as well as active analogs and fragments of such enzymes. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for producing such polypeptides by recombinant techniques comprising culturing
recombinant prokaryotic and/or eukaryotic host cells, containing a nucleic acid sequence 
of the present invention, under conditions promoting expression of said enzymes and 
subsequent recovery of said enzymes. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes, or polynucleotides encoding such enzymes for 
hydrolyzing lactose to galactose and glucose for use in the food processing industry, the
pharmaceutical industry, for example, to treat intolerance to lactose, as a diagnostic reporter
molecule, in corn wet milling, in the fruit juice industry, in baking, in the textile industry
and in the detergent industry. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes for hydrolyzing guar gum (a galactomannan 
polysaccharide) to remove non-reducing terminal mannose residues. Further 
polysaccharides such as galactomannan and the enzymes according to the invention that
degrade them have a variety of applications. Guar gum is commonly used as a thickening 
agent in food and also is utilized in hydraulic fracturing in oil and gas recovery. 
Consequently, mannanases are industrially relevant for the degradation and modification 
of guar gums. Furthermore, a need exists for thermostable mannanases that are active in
extreme conditions associated with drilling and well stimulation.

In accordance with yet a further aspect of the present invention, there are also
provided nucleic acid probes comprising nucleic acid molecules of sufficient length to 
specifically hybridize to a nucleic acid sequence of the present invention. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes, or polynucleotides encoding such enzymes, for in
vitro purposes related to scientific research, for example, to generate probes for identifying 
similar sequences which might encode similar enzymes from other organisms by using 
certain regions, i.e., conserved sequence regions, of the nucleotide sequence.

These and other aspects of the present invention should be apparent to those skilled 
in the art from the teachings herein. 

The polynucleotides of this invention were originally recovered from genomic gene 
libraries derived from the following organisms: 

M11TL is a new species of Desulfurococcus isolated from Diamond Pool in
Yellowstone National Park. The organism grows optimally at 85-88 °C, pH 7.0 in a low salt
medium containing yeast extract, peptone, and gelatin as substrates with a N2/CO2 gas
phase.

OC1/4V is from the genus Thermotoga. The organism was isolated from
Yellowstone National Park. It grows optimally at 75 °C in a low salt medium with cellulose
as a substrate and N2 in gas phase.

Pyrococcus furiosus VC1 (and 7EG1) is from the genus Pyrococcus. VC1 was
isolated from Vulcano, Italy. It grows optimally at 100°C in a high salt medium (marine)
containing elemental sulfur, yeast extract, peptone and starch as substrates and N2 in gas
phase.

Staphylothermus marinus F1 is from the genus Staphylothermus. F1 was isolated
from Vulcano, Italy. It grows optimally at 85 °C, pH 6.5 in high salt medium (marine)
containing elemental sulfur and yeast extract as substrates and N2 in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2 was isolated from
diffuse vent fluid in the East Pacific Rise. It is a strict anaerobe that grows optimally at
87°C.

Thermotoga maritima MSB8 and MSB8 (Clone # 6GP2 and 6GB4) is from the 
genus Thermotoga, and was isolated from Vulcano, Italy. MSB8 grows optimally at 85 °C,
pH 6.5 in a high salt medium (marine) containing starch and yeast extract as substrates and
N2 in gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus Thermococcus. 
AEDII12RA grows optimally at 85 °C, pH 9.5 in a high salt medium (marine) containing 
polysulfides and yeast extract as substrates and N2 in gas phase.

Thermococcus chitonophagus GC74 is from the genus Thermococcus. GC74 grows 
optimally at 85 °C, pH 6.0 in a high salt medium (marine) containing chitin, meat extract, 
elemental sulfur and yeast extract as substrates and N2 in gas phase.

AEPII 1a grows optimally at 85 °C at pH 6.5 in marine medium under anaerobic
conditions. It has many substrates.

Bankia gouldi is from the genus Bankia.

Accordingly, the polynucleotides and enzymes encoded thereby are identified by
the organism from which they were isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G" (Figure 2 and SEQ ID
NOS:2 and 16), "F1-12G" (Figure 3 and SEQ ID NOS:3 and 17), "9N2-31B/G" (Figure 4
and SEQ ID NOS:4 and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20), "GC74-22G" (Figure 7 and SEQ ID
NOS:7 and 21), "VC1-7G1" (Figure 8 and SEQ ID NOS:8 and 22), "37GP1" (Figure 9 and SEQ
ID NOS:9 and 23), "6GC2" (Figure 10 and SEQ ID NOS:10 and 24), "6GP2" (Figure 11
and SEQ ID NOS:11 and 25), "AEPII 1a" (Figure 12 and SEQ ID NOS:12 and 26),
"OC1/4V" (Figure 13 and SEQ ID NOS:13 and 27), "6GP3" (Figure 14 and SEQ ID
NOS:28), "MSB8-6GP2" (Figure 15 and SEQ ID NOS:57 and 61), "MSB8-6GB4" (Figure
16 and SEQ ID NOS:58 and 62), "VC1-7EG1" (Figure 17 and SEQ ID NOS:59 and 63),
and "37GP4" (Figure 18 and SEQ ID NOS:60 and 64).

The polynucleotides and polypeptides of the present invention show identity at the
nucleotide and protein level to known genes and proteins encoded thereby as shown in
Table 1.

Table 1

Clone                                       Gene/Protein with Closest Homology                        Protein Identity  Nucleic Acid Identity
------------------------------------------  --------------------------------------------------------  ----------------  ---------------------
M11TL                                       Sulfolobus solfataricus DSM 1616/P1, β-[illegible]        [illegible]       [illegible]
OC1/4V-33B/G                                Caldocellum saccharolyticum β-glucosidase                 52%               57%
Staphylothermus marinus F1-12G              Bacillus polymyxa, β-galactosidase                        36%               48%
Thermococcus 9N2-31B/G                      Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase   51%               50%
Thermotoga maritima MSB8-6G                 Clostridium thermocellum bglB                             45%               53%
Thermococcus AEDII12RA-18B/G                Bacillus polymyxa, β-galactosidase                        34%               48%
Thermococcus chitonophagus GC74-22G         Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase   46%               54%
Pyrococcus furiosus VC1-7G1                 Sulfolobus solfataricus/MT-4 β-galactosidase              46.4%             52.5%
Thermotoga maritima α-galactosidase (6GC2)  Pediococcus pentosaceus                                   49%               29%
Thermotoga maritima β-mannanase (6GP2)      Aspergillus aculeatus mannanase                           56%               37%
AEPII 1a β-mannosidase (63GB1)              Sulfolobus solfataricus β-galactosidase                   78%               56%
OC1/4V endoglucanase (33GP1)                Clostridium thermocellum endo-1,4-β-endoglucanase         65%               43%
Thermotoga maritima pullulanase (6GP3)      Caldocellum saccharolyticum α-dextran 6-glucanohydrolase  72%               53%
Bankia gouldi mix Endoglucanase (37GP1)     None available


The polynucleotides and enzymes of the present invention show homology to each
other as shown in Table 2.



WO 98/24799 PCT/US97/22623



Table 2

Clone                                       Gene/Protein with Closest Homology                          Protein Identity  Nucleic Acid Identity
------------------------------------------  ----------------------------------------------------------  ----------------  ---------------------
Staphylothermus marinus F1-12G              Thermococcus AEDII12RA-18B/G, β-galactosidase, glucosidase  55%               57%
Thermococcus 9N2-31B/G                      Thermococcus chitonophagus GC74-22G-glucosidase*            74%               66%
Pyrococcus furiosus VC1-7G1                 Pyrococcus furiosus VC1-7B/G β-galactosidase                46.4%             54%


All the clones identified in Tables 1 and 2 encode polypeptides which have
α-glycosidase or β-glycosidase activity.

This invention, in addition to the isolated nucleic acid molecules encoding the
enzymes of the present invention, also provides substantially similar sequences. Isolated
nucleic acid sequences are substantially similar if: (i) they are capable of hybridizing under
conditions hereinafter described, to the polynucleotides of SEQ ID NOS: 1-14 and 57-60;
or (ii) they are DNA sequences which are degenerate to the polynucleotides of SEQ ID
NOS: 1-14 and 57-60. Degenerate DNA sequences encode the amino acid sequences of
SEQ ID NOS:15-28 and 61-64, but have variations in the nucleotide coding sequences. As 
used herein, substantially similar refers to the sequences having similar identity to the 
sequences of the instant invention. The nucleotide sequences that are substantially the same 
can be identified by hybridization or by sequence comparison. Enzyme sequences that are 
substantially the same can be identified by one or more of the following: proteolytic 
digestion, gel electrophoresis and/or microsequencing. 
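
The notion of degenerate sequences described above can be illustrated with a minimal Python sketch. The two DNA strings and the partial codon table below are hypothetical examples chosen for illustration only; they are not taken from SEQ ID NOS: 1-14 or 57-60, and the helper names `translate` and `are_degenerate` are assumptions of this sketch.

```python
# Partial standard codon table: only the codons used in this illustration.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCG": "A",
    "CTG": "L", "TTA": "L",
    "AAA": "K", "AAG": "K",
    "GGT": "G", "GGC": "G",
    "TAA": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]  # KeyError if codon not in the partial table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def are_degenerate(seq_a: str, seq_b: str) -> bool:
    """True if the nucleotide sequences differ but encode the same amino acids."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


# Hypothetical sequences: different codons, same encoded peptide.
reference = "ATGGCTCTGAAAGGTTAA"
variant = "ATGGCGTTAAAGGGCTGA"
print(translate(reference), translate(variant), are_degenerate(reference, variant))
```

A full implementation would use the complete 64-codon table; the partial table here only keeps the example short.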

One means for isolating the nucleic acid molecules encoding the enzymes of the 
present invention is to probe a gene library with a natural or artificially designed probe 
using art recognized procedures (see, for example: Current Protocols in Molecular Biology,
Ausubel F.M. et al. (EDS.) Green Publishing Company Assoc. and John Wiley Interscience,
New York, 1989, 1992). It is appreciated by one skilled in the art that the polynucleotides
of SEQ ID NOS: 1-14 and 57-60 or fragments thereof (comprising at least 12 contiguous
nucleotides), are particularly useful probes. Other particularly useful probes for this purpose
are hybridizable fragments of the sequences of SEQ ID NOS: 1-14 and 57-60 (i.e.,
comprising at least 12 contiguous nucleotides).
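
As a rough illustration of what "fragments comprising at least 12 contiguous nucleotides" means in practice, the following Python sketch enumerates every contiguous window of a given length from a sequence and computes the strand each window would pair with. The input sequence is a hypothetical 15-base example, not one of the disclosed sequences, and the function names are assumptions of this sketch.

```python
MIN_PROBE_LENGTH = 12  # minimum fragment length stated in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_probes(sequence: str, length: int = MIN_PROBE_LENGTH) -> list:
    """List every contiguous fragment of the given length (at least 12 nt)."""
    if length < MIN_PROBE_LENGTH:
        raise ValueError("probe fragments must span at least 12 nucleotides")
    sequence = sequence.upper()
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 15-nt stretch: yields four overlapping 12-mers.
example = "ATGGCTCTGAAAGGT"
probes = candidate_probes(example)
print(probes)
print(reverse_complement(probes[0]))
```

Selecting among such candidates (for example by conserved region or melting temperature) is outside the scope of this sketch.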

With respect to nucleic acid sequences which hybridize to specific nucleic acid 
sequences disclosed herein, hybridization may be carried out under conditions of reduced 
stringency, medium stringency or even stringent conditions. As an example of 
oligonucleotide hybridization, a polymer membrane containing immobilized denatured 
nucleic acids is first prehybridized for 30 minutes at 45 °C in a solution consisting of 0.9 M
NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5
mg/ml polyriboadenylic acid. Approximately 2 X 10^7 cpm (specific activity 4-9 X 10^8
cpm/µg) of 32P end-labeled oligonucleotide probe are then added to the solution. After
12-16 hours of incubation, the membrane is washed for 30 minutes at room temperature in 1X
SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5%
SDS, followed by a 30 minute wash in fresh 1X SET at Tm-10°C for the oligonucleotide
probe. The membrane is then exposed to autoradiographic film for detection of 
hybridization signals. 

Stringent conditions means hybridization will occur only if there is at least 90%
identity, preferably at least 95% identity and most preferably at least 97% identity between
the sequences. See J. Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d Ed., Cold
Spring Harbor Laboratory (1989), which is hereby incorporated by reference in its entirety.
Further, it is understood that a fragment of a 100 bps sequence that is 95 bps in length has
95% identity with the 100 bps sequence from which it is obtained.

As used herein, a first DNA (RNA) sequence is at least 70% and preferably at least
80% identical to another DNA (RNA) sequence if there is at least 70% and preferably at
least 80% or 90% identity, respectively, between the bases of the first sequence and the
bases of the other sequence, when properly aligned with each other, for example when
aligned by BLASTN.

"Identity" as the term is used herein, refers to a polynucleotide sequence which 
comprises a percentage of the same bases as a reference polynucleotide (SEQ ID NOS: 1-14 
and 57-60). For example, a polynucleotide which is at least 90% identical to a reference 
polynucleotide, has polynucleotide bases which are identical in 90% of the bases which 
make up the reference polynucleotide and may have different bases in 10% of the bases 
which comprise that polynucleotide sequence. 
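
The definition of identity given above (identical bases counted against the bases of the reference polynucleotide) can be sketched in Python as follows. This is a minimal illustration that assumes the two sequences have already been aligned to equal length with "-" marking gaps; it does not perform the alignment itself, which the text attributes to tools such as BLASTN. The function name and the example sequences are hypothetical.

```python
def percent_identity(reference: str, query: str) -> float:
    """Percentage of reference bases matched by the query in a gapped alignment."""
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")
    reference_bases = 0
    matches = 0
    for ref_base, query_base in zip(reference.upper(), query.upper()):
        if ref_base == "-":
            continue  # gap in the reference is not a reference base
        reference_bases += 1
        if ref_base == query_base:
            matches += 1
    if reference_bases == 0:
        raise ValueError("reference contains no bases")
    return 100.0 * matches / reference_bases


# A 95-base fragment aligned against the 100-base sequence it came from.
reference = "ACGT" * 25
fragment = reference[:95] + "-" * 5
print(percent_identity(reference, fragment))
```

With these inputs the function reproduces the 95-of-100 case mentioned in the text for a 95 bps fragment of a 100 bps sequence.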

The present invention relates to polynucleotides which differ from the reference
polynucleotide such that the changes are silent changes, for example the changes do not alter
the amino acid sequence encoded by the polynucleotide. The present invention also relates 
to nucleotide changes which result in amino acid substitutions, additions, deletions, fusions 
and truncations in the polypeptide encoded by the reference polynucleotide. In a preferred 
aspect of the invention these polypeptides retain the same biological action as the 
polypeptide encoded by the reference polynucleotide. 

It is also appreciated that such probes can be and are preferably labeled with an 
analytically detectable reagent to facilitate identification of the probe. Useful reagents 
include but are not limited to radioactivity, fluorescent dyes or enzymes capable of 
catalyzing the formation of a detectable product. The probes are thus useful to isolate 
complementary copies of DNA from other sources or to screen such sources for related 
sequences. 

The polynucleotides of this invention were recovered from genomic gene libraries 
from the organisms listed in Table 1. For example, gene libraries can be generated in the 
Lambda ZAP II cloning vector (Stratagene Cloning Systems). Mass excisions can be 
performed on these libraries to generate libraries in the pBluescript phagemid. Libraries 
are thus generated and excisions performed according to the protocols/methods hereinafter 
described.

The excision libraries are introduced into the E. coli strain BW14893 F'kan1A.
Expression clones are then identified using a high temperature filter assay. Expression 
clones encoding several glucanases and several other glycosidases are identified and 
repurified. The polynucleotides, and enzymes encoded thereby, of the present invention, 
yield the activities as described above. 

The coding sequences for the enzymes of the present invention were identified by 
screening the genomic DNAs prepared for the clones having glucosidase or galactosidase 
activity. 

An example of such an assay is a high temperature filter assay wherein expression
clones were identified using buffer Z (see recipe below) containing 1 mg/ml of the
substrate 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside (XGLU) (Diagnostic Chemicals
Limited or Sigma) after introducing an excision library into the E. coli strain
BW14893 F'kan1A. Expression clones encoding
XGLUases were identified and repurified from M11TL, OC1/4V, Pyrococcus furiosus VC1,
Staphylothermus marinus F1, Thermococcus 9N-2, Thermotoga maritima MSB8,
Thermococcus alcaliphilus AEDII12RA, and Thermococcus chitonophagus GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short Course in Bacterial Genetics,
p. 445.)

per liter:

Na2HPO4·7H2O         16.1 g
NaH2PO4·7H2O         5.5 g
KCl                  0.75 g
MgSO4·7H2O           0.246 g
β-mercaptoethanol    2.7 ml
Adjust pH to 7.0
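
For preparing volumes other than one liter, the per-liter amounts listed above can be scaled linearly. The following Python sketch shows one way to do that; the dictionary keys and the function name `scale_z_buffer` are assumptions of this sketch, and the amounts are simply those given in the recipe as printed.

```python
# Per-liter amounts as listed in the Z-buffer recipe above.
Z_BUFFER_PER_LITER = {
    "Na2HPO4 (g)": 16.1,
    "NaH2PO4 (g)": 5.5,
    "KCl (g)": 0.75,
    "MgSO4.7H2O (g)": 0.246,
    "beta-mercaptoethanol (ml)": 2.7,
}


def scale_z_buffer(volume_liters: float) -> dict:
    """Scale each component linearly to the requested final volume."""
    if volume_liters <= 0:
        raise ValueError("volume must be positive")
    return {
        component: round(amount * volume_liters, 4)
        for component, amount in Z_BUFFER_PER_LITER.items()
    }


# Amounts for a 250 ml batch; the pH is still adjusted to 7.0 afterwards.
for component, amount in scale_z_buffer(0.25).items():
    print(f"{component}: {amount}")
```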

High Temperature Filter Assay
(1) The F factor F'kan (from E. coli strain CSH118)(1) was introduced into the pho- pnh-
lac- strain BW14893(2). The filamentous phage library was plated
on the resulting strain, BW14893 F'kan. (Miller, J.H. (1992) A Short Course in
Bacterial Genetics; Lee, K.S., Metcalf, et al. (1992) Evidence for two phosphonate
degradative pathways in Enterobacter aerogenes, J. Bacteriol., 174:2501-2510.)

(2) After growth on 100 mm LB plates containing 100 µg/ml ampicillin, 80 µg/ml
methicillin and 1 mM IPTG, colony lifts were performed using Millipore HATF
membrane filters. 

(3) The colonies transferred to the filters were lysed with chloroform vapor in 150 mm 
glass petri dishes. 

(4) The filters were transferred to 100 mm glass petri dishes containing a piece of 
Whatman 3 MM filter paper saturated with buffer. 

(a) when testing for galactosidase activity (XGALase), the 3 MM paper was
saturated with Z buffer containing 1 mg/ml XGAL (ChemBridge
Corporation). After transferring the filter bearing lysed colonies to the glass
petri dish, the dish was placed in an oven at 80-85 °C.

(b) when testing for glucosidase activity (XGLUase), the 3 MM paper was saturated
with Z buffer containing 1 mg/ml XGLU. After transferring the filter bearing
lysed colonies to the glass petri dish, the dish was placed in an oven at 80-85 °C.

(5) 'Positives' were observed as blue spots on the filter membranes. The following
filter rescue technique was used to retrieve plasmid from a lysed positive colony. A Pasteur
pipette (or glass capillary tube) was used to core blue spots on the filter membrane. The
small filter disk was placed in an Eppendorf tube containing 20 µl water. The
Eppendorf tube was incubated at 75 °C for 5 minutes followed by vortexing to elute plasmid
DNA off the filter. This DNA was transformed into electrocompetent E. coli DH10B cells
for Thermotoga maritima MSB8-6G, Staphylothermus marinus F1-12G,
Thermococcus AEDII12RA-18B/G, Thermococcus chitonophagus GC74-22G,
M11TL and OC1/4V. Electrocompetent BW14893 F'kan1A E. coli were used for
Thermococcus 9N2-31B/G and Pyrococcus furiosus VC1-7G1. The filter-lift
assay was repeated on transformation plates to identify 'positives'. Transformation plates
were returned to a 37 °C incubator after the filter lift to regenerate colonies. 3 ml of LB
liquid containing 100 µg/ml ampicillin was inoculated with repurified positives and
incubated at 37 °C overnight. Plasmid DNA was isolated from these cultures and the plasmid
insert was sequenced.
In some instances where the plates used for the initial colony lifts contained
non-confluent colonies, a specific colony corresponding to a blue spot on the filter could
be identified on a regenerated plate and repurified directly, instead of using the filter
rescue technique.

Another example of such an assay is a variation of the high temperature filter assay 
wherein colony-laden filters are heat-killed at different temperatures (for example, 105°C 
for 20 minutes) to monitor thermostability. The 3 MM paper is saturated with different 
buffers (e.g., 100 mM NaCl, 5 mM MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine enzyme
activity under different buffer conditions. 

A β-glucosidase assay may also be employed, wherein GlcpβpNp is used as an
artificial substrate (aryl-β-glucosidase). The increase in absorbance at 405 nm as a result
of p-nitrophenol (pNp) liberation is followed on a Hitachi U-1100 spectrophotometer
equipped with a thermostatted cuvette holder. The assays may be performed at 80 °C or
90 °C in a closed 1-ml quartz cuvette. A standard reaction mixture contains 150 mM
trisodium substrate, pH 5.0 (at 80 °C), and 0.95 mM pNp derivative (ε pNp = 0.561 mM^-1
cm^-1). The reaction mixture is allowed to reach the desired temperature, after which the
reaction is started by injecting an appropriate amount of enzyme (1.06 ml final volume).

One unit (1 U) of β-glucosidase activity is defined as that amount required to catalyze the 
formation of 1.0 µmol pNp/min. D-cellobiose may also be used as a substrate. 
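
As an illustration of how the unit definition relates to the spectrophotometric readout, the following minimal sketch converts a rate of absorbance change at 405 nm into units of activity using the Beer-Lambert law. The extinction coefficient (0.561 mM⁻¹ cm⁻¹) and the final volume (1.06 ml) are the values stated above; the 1 cm path length and the function name are assumptions made for the example.

```python
# Values stated in the assay description.
EPSILON_PNP = 0.561      # mM^-1 cm^-1, extinction coefficient of pNp
FINAL_VOLUME_ML = 1.06   # ml, final reaction volume


def glucosidase_units(delta_a405_per_min, path_cm=1.0,
                      epsilon=EPSILON_PNP, volume_ml=FINAL_VOLUME_ML):
    """Return activity in U (umol pNp released per minute).

    path_cm = 1.0 is an assumed cuvette path length.
    """
    # Beer-Lambert: concentration change (mM/min) = dA/min / (epsilon * path)
    rate_mm_per_min = delta_a405_per_min / (epsilon * path_cm)
    # 1 mM equals 1 umol/ml, so mM/min times ml gives umol/min.
    return rate_mm_per_min * volume_ml


if __name__ == "__main__":
    print(glucosidase_units(0.2))
```

Dividing the result by the amount of protein added would give a specific activity, if that figure is needed.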

An ONPG assay for β-galactosidase activity is described by Miller, J.H. (1992) A 
Short Course in Bacterial Genetics and Miller, J.H. (1992) Experiments in Molecular 
Genetics, the contents of which are hereby incorporated by reference in their entirety. 

A quantitative fluorometric assay for β-galactosidase specific activity is described 
by Youngman, P. (1987) Plasmid Vectors for Recovering and Exploiting Tn917 
Transpositions in Bacillus and other Gram-Positive Bacteria. In Plasmids: A Practical 
Approach (ed. K. Hardy) pp 79-103. IRL Press, Oxford. A description of the procedure can 
be found in Miller (1992) p. 75-77, the contents of which are incorporated by reference 
herein in their entirety. 

The polynucleotides of the present invention may be in the form of DNA which 
DNA includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double- 
stranded or single-stranded, and if single stranded may be the coding strand or non-coding 
(anti-sense) strand. The coding sequence which encodes the mature enzymes may be 
identical to the coding sequences shown in Figures 1-18 (SEQ ID NOS: 1-14 and 57-60) or 
may be a different coding sequence which coding sequence, as a result of the redundancy 
or degeneracy of the genetic code, encodes the same mature enzymes as the DNA of 
Figures 1-18 (SEQ ID NOS: 1-14 and 57-60). 
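
The degeneracy point can be illustrated with a short sketch that translates two different coding sequences and checks whether they encode the same amino acid sequence. It assumes the standard genetic code; the function names and the example sequences are hypothetical and are not taken from the figures.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding strand in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def encode_same_protein(seq_a, seq_b):
    """True if two coding sequences give the same translation."""
    return translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    # GCT and GCC both encode Ala; AAA and AAG both encode Lys.
    print(encode_same_protein("ATGGCTAAA", "ATGGCCAAG"))
```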

The polynucleotide which encodes for the mature enzyme of Figures 1-18 (SEQ ID 
NOS: 15-28 and 61-64) may include, but is not limited to: only the coding sequence for the 
mature enzyme; the coding sequence for the mature enzyme and additional coding sequence 
such as a leader sequence or a proprotein sequence; the coding sequence for the mature 
enzyme (and optionally additional coding sequence) and non-coding sequence, such as 
introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature enzyme. 

Thus, the term "polynucleotide encoding an enzyme (protein)" encompasses a 
polynucleotide which includes only coding sequence for the enzyme as well as a 
polynucleotide which includes additional coding and/or non-coding sequence. 

The present invention further relates to variants of the hereinabove described 
polynucleotides which encode for fragments, analogs and derivatives of the enzymes having 
the deduced amino acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). The 
variant of the polynucleotide may be a naturally occurring allelic variant of the 
polynucleotide or a non-naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides encoding the same mature 
enzymes as shown in Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as variants of 
such polynucleotides which variants encode for a fragment, derivative or analog of the 
enzymes of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). Such nucleotide variants 
include deletion variants, substitution variants and addition or insertion variants. 

As hereinabove indicated, the polynucleotides may have a coding sequence which 
is a naturally occurring allelic variant of the coding sequences shown in Figures 1-18 (SEQ 
ID NOS: 1-14 and 57-60). As known in the art, an allelic variant is an alternate form of a 
polynucleotide sequence which may have a substitution, deletion or addition of one or more 
nucleotides, which does not substantially alter the function of the encoded enzyme. 

Fragments of the full length gene of the present invention may be used as a 
hybridization probe for a cDNA or a genomic library to isolate the full length DNA and to 
isolate other DNAs which have a high sequence similarity to the gene or similar biological 
activity. Probes of this type preferably have at least 10, preferably at least 15, and even 
more preferably at least 30 bases and may contain, for example, at least 50 or more bases. 
The probe may also be used to identify a DNA clone corresponding to a full length 
transcript and a genomic clone or clones that contain the complete gene including 
regulatory and promoter regions, exons, and introns. An example of a screen comprises 
isolating the coding region of the gene by using the known DNA sequence to synthesize an 
oligonucleotide probe. Labeled oligonucleotides having a sequence complementary to that 
of the gene of the present invention are used to screen a library of genomic DNA to 
determine which members of the library the probe hybridizes to. 
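
The screening idea can be sketched in silico as follows. This is only a simplified illustration that treats hybridization as perfect base pairing over the full probe length; real hybridization tolerates mismatches depending on conditions. The function names and the toy library are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_hits(probe, library):
    """Return names of library members the probe could pair with.

    Each library entry is the top strand of a double-stranded clone.
    The probe pairs with the top strand where its reverse complement
    occurs, and with the bottom strand where the probe itself occurs.
    """
    probe = probe.upper()
    target = reverse_complement(probe)
    return [
        name for name, seq in library.items()
        if target in seq.upper() or probe in seq.upper()
    ]


if __name__ == "__main__":
    library = {"clone_a": "TTTACGGATCCAAA", "clone_b": "GGGGGGGGGG"}
    print(probe_hits("GGATCCGT", library))
```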

The present invention further relates to polynucleotides which hybridize to the 
hereinabove-described sequences if there is at least 70%, preferably at least 90%, and more 
preferably at least 95% identity between the sequences. The present invention particularly 
relates to polynucleotides which hybridize under stringent conditions to the hereinabove- 
described polynucleotides. As herein used, the term "stringent conditions" means 
hybridization will occur only if there is at least 95% and preferably at least 97% identity 
between the sequences. The polynucleotides which hybridize to the hereinabove described 
polynucleotides in a preferred embodiment encode enzymes which retain 
substantially the same biological function or activity as the mature enzyme encoded by the 
DNA of Figures 1-18 (SEQ ID NOS: 1-14 and 57-60). 
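
The identity thresholds recited above (70%, 90%, 95% and 97%) can be made concrete with a small sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it counts identity over the full alignment length; both the alignment convention and the function names are assumptions for illustration and not a prescribed method.

```python
THRESHOLDS = (97.0, 95.0, 90.0, 70.0)  # percent identity levels named in the text


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical, non-gap residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        x == y and x != "-"
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
    )
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(pct):
    """Return the highest recited threshold satisfied, or None."""
    for level in THRESHOLDS:
        if pct >= level:
            return level
    return None


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, highest_threshold_met(pct))
```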

Alternatively, the polynucleotide may have at least 15 bases, preferably at least 30 
bases, and more preferably at least 50 bases which hybridize to any part of a polynucleotide 
of the present invention and which has an identity thereto, as hereinabove described, and 
which may or may not retain activity. For example, such polynucleotides may be employed 
as probes for the polynucleotides of SEQ ID NOS: 1-14 and 57-60, for example, for 
recovery of the polynucleotide or as a diagnostic probe or as a PCR primer. 

Thus, the present invention is directed to polynucleotides having at least a 70% 
identity, preferably at least 90% identity and more preferably at least a 95% identity to a 
polynucleotide which encodes the enzymes of SEQ ID NOS: 15-28 and 61-64 as well as 
fragments thereof which fragments have at least 15 bases, preferably at least 30 bases and 
most preferably at least 50 bases, which fragments are at least 90% identical, preferably at 
least 95% identical and most preferably at least 97% identical under stringent conditions 
to any portion of a polynucleotide of the present invention. 

The present invention further relates to enzymes which have the deduced amino acid 
sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as fragments, analogs 
and derivatives of such enzyme. 

The terms "fragment," "derivative" and "analog" when referring to the enzymes of 
Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) means enzymes which retain essentially the 
same biological function or activity as such enzymes. Thus, an analog includes a proprotein 
which can be activated by cleavage of the proprotein portion to produce an active mature 
enzyme. 

The enzymes of the present invention may be a recombinant enzyme, a natural 
enzyme or a synthetic enzyme, preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of Figures 1-18 (SEQ ID NOS: 
15-28 and 61-64) may be (i) one in which one or more of the amino acid residues are 
substituted with a conserved or non-conserved amino acid residue (preferably a conserved 
amino acid residue) and such substituted amino acid residue may or may not be one 
encoded by the genetic code, or (ii) one in which one or more of the amino acid residues 
includes a substituent group, or (iii) one in which the mature enzyme is fused with another 
compound, such as a compound to increase the half-life of the enzyme (for example, 
polyethylene glycol), or (iv) one in which the additional amino acids are fused to the mature 
enzyme, such as a leader or secretory sequence or a sequence which is employed for 
purification of the mature enzyme or a proprotein sequence. Such fragments, derivatives 
and analogs are deemed to be within the scope of those skilled in the art from the teachings 
herein. 

The enzymes and polynucleotides of the present invention are preferably provided 
in an isolated form, and preferably are purified to homogeneity. 

The term "isolated" means that the material is removed from its original 
environment (e.g., the natural environment if it is naturally occurring). For example, a 
naturally-occurring polynucleotide or enzyme present in a living animal is not isolated, but 
the same polynucleotide or enzyme, separated from some or all of the coexisting materials 
in the natural system, is isolated. Such polynucleotides could be part of a vector and/or 
such polynucleotides or enzymes could be part of a composition, and still be isolated in that 
such vector or composition is not part of its natural environment. 

The enzymes of the present invention include the enzymes of SEQ ID NOS: 15-28 
and 61-64 (in particular the mature enzyme) as well as enzymes which have at least 70% 
similarity (preferably at least 70% identity) to the enzymes of SEQ ID NOS: 15-28 and 61-64 
and more preferably at least 90% similarity (more preferably at least 90% identity) to 
the enzymes of SEQ ID NOS: 15-28 and 61-64 and still more preferably at least 95% 
similarity (still more preferably at least 95% identity) to the enzymes of SEQ ID NOS: 15- 
28 and 61-64 and also include portions of such enzymes with such portion of the enzyme 
generally containing at least 30 amino acids and more preferably at least 50 amino acids. 

As known in the art "similarity" between two enzymes is determined by comparing 
the amino acid sequence and its conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme. 

A variant, i.e. a "fragment", "analog" or "derivative" polypeptide, and a reference 
polypeptide may differ in amino acid sequence by one or more substitutions, additions, 
deletions, fusions and truncations, which may be present in any combination. 

Among preferred variants are those that vary from a reference by conservative 
amino acid substitutions. Such substitutions are those that substitute a given amino acid in 
a polypeptide by another amino acid of like characteristics. Typically seen as conservative 
substitutions are the replacements, one for another, among the aliphatic amino acids Ala, 
Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr, exchange of the acidic 
residues Asp and Glu, substitution between the amide residues Asn and Gln, exchange of 
the basic residues Lys and Arg and replacements among the aromatic residues Phe and Tyr. 
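
The conservative substitution groups listed above can be captured in a small lookup, sketched here with three-letter residue codes. The group labels and the function name are chosen for the example only.

```python
# Groups exactly as listed in the text above.
CONSERVATIVE_GROUPS = {
    "aliphatic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
    "basic": {"Lys", "Arg"},
    "aromatic": {"Phe", "Tyr"},
}


def is_conservative(original, replacement):
    """True if both residues differ and fall within one listed group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )


if __name__ == "__main__":
    print(is_conservative("Leu", "Ile"))
    print(is_conservative("Asp", "Lys"))
```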

Most highly preferred are variants which retain the same biological function and 
activity as the reference polypeptide from which they vary. 

Fragments or portions of the enzymes of the present invention may be employed for 
producing the corresponding full-length enzyme by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the full-length enzymes. 
Fragments or portions of the polynucleotides of the present invention may be used to 
synthesize full-length polynucleotides of the present invention. 

The present invention also relates to vectors which include polynucleotides of the 
present invention, host cells which are genetically engineered with vectors of the invention 
and the production of enzymes of the invention by recombinant techniques. 

Host cells are genetically engineered (transduced or transformed or transfected) with 
the vectors of this invention which may be, for example, a cloning vector or an expression 
vector. The vector may be, for example, in the form of a plasmid, a viral particle, a phage, 
etc. The engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants or amplifying the genes of the 
present invention. The culture conditions, such as temperature, pH and the like, are those 
previously used with the host cell selected for expression, and will be apparent to the 
ordinarily skilled artisan. 

The polynucleotides of the present invention may be employed for producing 
enzymes by recombinant techniques. Thus, for example, the polynucleotide may be 
included in any one of a variety of expression vectors for expressing an enzyme. Such 
vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., 
derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors 
derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, 
adenovirus, fowl pox virus, and pseudorabies. However, any other vector may be used as 
long as it is replicable and viable in the host. 

The appropriate DNA sequence may be inserted into the vector by a variety of 
procedures. In general, the DNA sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such procedures and others are 
deemed to be within the scope of those skilled in the art. 

The DNA sequence in the expression vector is operatively linked to an appropriate 
expression control sequence(s) (promoter) to direct mRNA synthesis. As representative 
examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli 
lac or trp, the phage lambda PL promoter and other promoters known to control expression 
of genes in prokaryotic or eukaryotic cells or their viruses. The expression vector also 
contains a ribosome binding site for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for amplifying expression. 

In addition, the expression vectors preferably contain one or more selectable marker 
genes to provide a phenotypic trait for selection of transformed host cells such as 
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as 
tetracycline or ampicillin resistance in E. coli. 

The vector containing the appropriate DNA sequence as hereinabove described, as 
well as an appropriate promoter or control sequence, may be employed to transform an 
appropriate host to permit the host to express the protein. 

As representative examples of appropriate hosts, there may be mentioned: bacterial 
cells, such as E. coli, Streptomyces, Bacillus subtilis; fungal cells, such as yeast; insect cells 
such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes 
melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed 
to be within the scope of those skilled in the art from the teachings herein. 

More particularly, the present invention also includes recombinant constructs 
comprising one or more of the sequences as broadly described above. The constructs 
comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention 
has been inserted, in a forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory sequences, including, for example, 
a promoter, operably linked to the sequence. Large numbers of suitable vectors and 
promoters are known to those of skill in the art, and are commercially available. The 
following vectors are provided by way of example. Bacterial: pQE70, pQE60, pQE-9 
(Qiagen), pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A 
(Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); Eukaryotic: 
pSV2CAT, pOG44, pXT1, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL (Pharmacia). 
However, any other plasmid or vector may be used as long as they are replicable and viable 
in the host. 

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two 
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include 
lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include CMV 
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and 
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within 
the level of ordinary skill in the art. 

In a further embodiment, the present invention relates to host cells containing the 
above-described constructs. The host cell can be a higher eukaryotic cell, such as a 
mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a 
prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can 
be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or 
electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, 
(1986)). 

The constructs in host cells can be used in a conventional manner to produce the 
gene product encoded by the recombinant sequence. Alternatively, the enzymes of the 
invention can be synthetically produced by conventional peptide synthesizers. 

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells 
under the control of appropriate promoters. Cell-free translation systems can also be 
employed to produce such proteins using RNAs derived from the DNA constructs of the 
present invention. Appropriate cloning and expression vectors for use with prokaryotic and 
eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory 
Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is 
hereby incorporated by reference. 

Transcription of the DNA encoding the enzymes of the present invention by higher 
eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are 
cis-acting elements of DNA, usually about from 10 to 300 bp that act on a promoter to 
increase its transcription. Examples include the SV40 enhancer on the late side of the 
replication origin bp 100 to 270, a cytomegalovirus early promoter enhancer, the polyoma 
enhancer on the late side of the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include origins of replication and 
selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance 
gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly- 
expressed gene to direct transcription of a downstream structural sequence. Such promoters 
can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate 
kinase (PGK), a-factor, acid phosphatase, or heat shock proteins, among others. The 
heterologous structural sequence is assembled in appropriate phase with translation 
initiation and termination sequences, and preferably, a leader sequence capable of directing 
secretion of translated enzyme. Optionally, the heterologous sequence can encode a fusion 
enzyme including an N-terminal identification peptide imparting desired characteristics, 
e.g., stabilization or simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a structural 
DNA sequence encoding a desired protein together with suitable translation initiation and 
termination signals in operable reading phase with a functional promoter. The vector will 
comprise one or more phenotypic selectable markers and an origin of replication to ensure 
maintenance of the vector and to, if desirable, provide amplification within the host. 
Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella 
typhimurium and various species within the genera Pseudomonas, Streptomyces, and 
Staphylococcus, although others may also be employed as a matter of choice. 

As a representative but nonlimiting example, useful expression vectors for bacterial 
use can comprise a selectable marker and bacterial origin of replication derived from 
commercially available plasmids comprising genetic elements of the well known cloning 
vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec, Madison, WI, 
USA). These pBR322 "backbone" sections are combined with an appropriate promoter and 
the structural sequence to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to 
an appropriate cell density, the selected promoter is induced by appropriate means (e.g., 
temperature shift or chemical induction) and cells are cultured for an additional period. 

Cells are typically harvested by centrifugation, disrupted by physical or chemical 
means, and the resulting crude extract retained for further purification. 

Microbial cells employed in expression of proteins can be disrupted by any 
convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or 
use of cell lysing agents; such methods are well known to those skilled in the art. 

Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 lines 
of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell 
lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa 
and BHK cell lines. Mammalian expression vectors will comprise an origin of replication, 
a suitable promoter and enhancer, and also any necessary ribosome binding sites, 
polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, 
and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 splice 
and polyadenylation sites may be used to provide the required nontranscribed genetic 
elements. 

The enzyme can be recovered and purified from recombinant cell cultures by 
methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or 
cation exchange chromatography, phosphocellulose chromatography, hydrophobic 
interaction chromatography, affinity chromatography, hydroxylapatite chromatography and 
lectin chromatography. Protein refolding steps can be used, as necessary, in completing 
configuration of the mature protein. Finally, high performance liquid chromatography 
(HPLC) can be employed for final purification steps. 

The enzymes of the present invention may be a naturally purified product, or a 
product of chemical synthetic procedures, or produced by recombinant techniques from a 
prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and 
mammalian cells in culture). Depending upon the host employed in a recombinant 
production procedure, the enzymes of the present invention may be glycosylated or may be 
non-glycosylated. Enzymes of the invention may or may not also include an initial 
methionine amino acid residue. 

β-galactosidase hydrolyzes lactose to galactose and glucose. Accordingly, the 
OC1/4V, 9N2-31B/G, AEDII12RA-18B/G and F1-12G enzymes may be employed in the 
food processing industry for the production of low lactose content milk and for the 
production of galactose or glucose from lactose contained in whey obtained in a large 
amount as a by-product in the production of cheese. Generally, it is desired that enzymes 
used in food processing, such as the aforementioned β-galactosidases, be stable at elevated 
temperatures to help prevent microbial contamination. 

These enzymes may also be employed in the pharmaceutical industry. The enzymes 
are used to treat intolerance to lactose. In this case, a thermostable enzyme is desired, as 
well. Thermostable β-galactosidases also have uses in diagnostic applications, where they 
are employed as reporter molecules. 

Glucosidases act on soluble cellooligosaccharides from the non-reducing end to give 
glucose as the sole product. Glucanases (endo- and exo-) act in the depolymerization of 
cellulose, generating more non-reducing ends (endo-glucanases, for instance, act on internal 
linkages yielding cellobiose, glucose and cellooligosaccharides as products). 
β-glucosidases are used in applications where glucose is the desired product. Accordingly, 
M11TL, F1-12G, GC74-22G, MSB8-6G, OC1/4V, VC1-7G1, 9N2-31B/G and 
AEDII12RA-18B/G may be employed in a wide variety of industrial applications, including 
in corn wet milling for the separation of starch and gluten, in the fruit industry for 
clarification and equipment maintenance, in baking for viscosity reduction, in the textile 
industry for the processing of blue jeans, and in the detergent industry as an additive. For 
these and other applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding to a sequence of the 
present invention can be obtained by direct injection of the enzymes into an animal or by 
administering the enzymes to an animal, preferably a nonhuman. The antibody so obtained 
will then bind the enzymes themselves. In this manner, even a sequence encoding only a 
fragment of the enzymes can be used to generate antibodies binding the whole native 
enzymes. Such antibodies can then be used to isolate the enzyme from cells expressing that 
enzyme. 

For preparation of monoclonal antibodies, any technique which provides antibodies 
produced by continuous cell line cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the 
human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the 
EBV-hybridoma technique to produce human monoclonal antibodies (Cole, et al., 1985, in 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain antibodies (U.S. Patent 
4,946,778) can be adapted to produce single chain antibodies to immunogenic enzyme 
products of this invention. Also, transgenic mice may be used to express humanized 
antibodies to immunogenic enzyme products of this invention. 

Antibodies generated against the enzyme of the present invention may be used in 
screening for similar enzymes from other organisms and samples. Such screening 
techniques are known in the art, for example, one such screening assay is described in 
"Methods for Measuring Cellulase Activities", Methods in Enzymology, Vol 160, pp. 87- 
116, which is hereby incorporated by reference in its entirety. 

The present invention will be further described with reference to the following 
examples; however, it is to be understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified, are by weight. 

In order to facilitate understanding of the following examples certain frequently 
occurring methods and/or terms will be described. 

"Plasmids" are designated by a lower case p preceded and/or followed by capital 
letters and/or numbers. The starting plasmids herein are either commercially available, 
publicly available on an unrestricted basis, or can be constructed from available plasmids 
in accord with published procedures. In addition, equivalent plasmids to those described 
are known in the art and will be apparent to the ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction 
enzyme that acts only at certain sequences in the DNA. The various restriction enzymes 
used herein are commercially available and their reaction conditions, cofactors and other 
requirements were used as would be known to the ordinarily skilled artisan. For analytical 
purposes, typically 1 µg of plasmid or DNA fragment is used with about 2 units of enzyme 
in about 20 µl of buffer solution. For the purpose of isolating DNA fragments for plasmid 
construction, typically 5 to 50 µg of DNA are digested with 20 to 250 units of enzyme in 
a larger volume. Appropriate buffers and substrate amounts for particular restriction 
enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37°C are 
ordinarily used, but may vary in accordance with the supplier's instructions. After digestion 
the reaction is electrophoresed directly on a polyacrylamide gel to isolate the desired 
fragment. 

Size separation of the cleaved fragments is performed using 8 percent 
polyacrylamide gel described by Goeddel, D. et al., Nucleic Acids Res., 8:4057 (1980). 

"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two 
complementary polydeoxynucleotide strands which may be chemically synthesized. Such 
synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another 
oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A 
synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated. 

"Ligation" refers to the process of forming phosphodiester bonds between two 
double stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise 
provided, ligation may be accomplished using known buffers and conditions with 10 units 
of T4 DNA ligase ("ligase") per 0.5 µg of approximately equimolar amounts of the DNA 
fragments to be ligated. 
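
For convenience, the stated ligase ratio can be scaled to other DNA amounts. The sketch below assumes simple linear scaling from the stated figure of 10 units per 0.5 µg of DNA; the function name is hypothetical.

```python
# Ratio stated in the definition of "Ligation" above.
LIGASE_UNITS_PER_REFERENCE = 10.0   # units of T4 DNA ligase
REFERENCE_DNA_UG = 0.5              # micrograms of DNA fragments


def ligase_units(dna_ug):
    """Units of T4 DNA ligase for a given DNA mass, assuming linear scaling."""
    if dna_ug < 0:
        raise ValueError("DNA amount must be non-negative")
    return LIGASE_UNITS_PER_REFERENCE * dna_ug / REFERENCE_DNA_UG


if __name__ == "__main__":
    print(ligase_units(2.0))
```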

Unless otherwise stated, transformation was performed as described in the method 
of Graham, F. and Van der Eb, A., Virology, 52:456-457 (1973). 

Example 1 

Bacterial Expression and Purification of Glycosidase Enzymes 

DNA encoding the enzymes of the present invention, SEQ ID NOS: 1-14 and 57-60 
were initially amplified from a pBluescript vector containing the DNA by the PCR 
technique using the primers noted herein. The amplified sequences were then inserted into 
the respective pQE vector listed beneath the primer sequences, and the enzyme was 
expressed according to the protocols set forth herein. The 5' and 3' primer sequences for 
the respective genes are as follows: 

Thermococcus AEDII12RA-18B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC 3' (SEQ ID NO:29) 
3' CGGAAGATCTTCATAGCTCCGGAACCCCATA 5' (SEQ ID NO:30) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

OC1/4V-33B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC 3' 
(SEQ ID NO:31) 

3' CGGAAGATCTTTAAGATTTTAGAAATTCCTT 5' (SEQ ID NO:32) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

Thermococcus 9N2-31B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGGCTTTCTC 3' 
(SEQ ID NO:33) 

3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC 5' (SEQ ID NO:34) 

Vector: pQE30; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Staphylothermus marinus F1-12G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGGTTTCCTGATTAT 3' 
(SEQ ID NO:35) 

3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC 5' (SEQ ID NO:36) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

Thermococcus chitonophagus GC74 - 22G 

5' CCGAGAATTCATTCATTAAAGAGGAGAAATTAACTATGCTTCCAGGAGAACTTTCTC 3' 
(SEQ ID NO:37) 

3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC 5' (SEQ ID NO:38) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
BamHI. 

M11TL 

5' AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG 3' (SEQ ID NO:39) 
3' AATAAAAGCTTACTGGATCAGTGTAAGATGCT 5' (SEQ ID NO:40) 

Vector: pQE70; and contains the following restriction enzyme sites 5' SphI and 3' Hind 
III. 

Thermotoga maritima MSB8-6G 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 3' (SEQ ID NO:41) 
3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC 5' (SEQ ID NO:42) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Pyrococcus furiosus VC1 - 7G1 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 3' (SEQ ID NO:43) 
3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC 5' (SEQ ID NO:44) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Kpn 
I. 

Bankia gouldi endoglucanase (37GP1) 

5' AATAAGGATCCGTTTAGCGACGCTCGC 3' (SEQ ID NO:45) 

3' AATAAAAGCTTCCGGGTTGTACAGCGGTAATAGGC 5' (SEQ ID NO:46) 

Vector: pQE52; and contains the following restriction enzyme sites 5' Bam HI and 3' 
Hind III. 

Thermotoga maritima α-galactosidase (6GC2) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGATCTGTGTGGAAATATTCGGAAAG 3' 
(SEQ ID NO:47) 

3' TCTATAAAGCTTTCATTCTCTCTCACCCTCTTCGTAGAAG 5' (SEQ ID NO:48) 

Vector: pQET; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind 
III. 

Thermotoga maritima β-mannanase (6GP2) 

5' TTTATTCAATTGATTAAAGAGGAGAAATTAACTATGGGGATTGGTGGCGACGAC 3' 
(SEQ ID NO:49) 

3' TTTATTAAGCTTATCTTTTCATATTCACATACCTCC 5' (SEQ ID NO:50) 

Vector: pQEt; and contains the following restriction enzyme sites 5' Hind III and 3' 
EcoRI. 

AEPII1a β-mannanase (63GB1) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGAGTTCCTATGGGGC 3' 
(SEQ ID NO:51) 

3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC 5' (SEQ ID NO:52) 

Vector: pQEt; and contains the following restriction enzyme sites 5' Hind III and 3' 
EcoRI. 

OC1/4V endoglucanase (33GP1) 

5' AAAAAACAATTGAATTCATTAAAGAGGAGAAATTAACT 3' (SEQ ID NO:53) 

3' TTTTTCGGATCCAATTCTTCATTTACTCTTTGCCTG 5' (SEQ ID NO:54) 

Vector: pQEt; and contains the following restriction enzyme sites 5' BamHI and 3' 
EcoRI. 

Thermotoga maritima pullulanase (6GP3) 

5' TTTTGGAATTCATTAAAGAGGAGAAATTAACTATGOAACTGATCATAGAAGGTTAC 3' 
(SEQ ID NO:55) 

3' ATAAGAAGCTTTTCACTCTCTGTACAGAACGTACGC 5' (SEQ ID NO:56) 

Vector: pQEt; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind 
III. 

The restriction enzyme sites indicated correspond to the restriction enzyme sites on 
the bacterial expression vector indicated for the respective gene (Qiagen, Inc. Chatsworth, 
CA). The pQE vector encodes antibiotic resistance (Amp^r), a bacterial origin of replication 
(ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site (RBS), a 6-His 
tag and restriction enzyme sites. 
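
As an illustrative check of the primer and vector listings above, the following Python sketch scans a primer for the recognition sequences of the enzymes named in this Example. The function name and the dictionary are hypothetical helpers introduced here and are not part of the disclosed protocol. The recognition sequences are the standard ones for these enzymes; because each is palindromic, the search gives the same result whether a primer is written 5' to 3' or 3' to 5'.

```python
# Standard recognition sequences for the enzymes named in Example 1.
# The dictionary and function below are illustrative assumptions only.
SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based position} for every site found in the primer."""
    seq = primer.upper().replace(" ", "")
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


# Primers as printed for Thermococcus AEDII12RA (SEQ ID NO:29 and NO:30).
forward_29 = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC"
reverse_30 = "CGGAAGATCTTCATAGCTCCGGAACCCCATA"

print(find_sites(forward_29))
print(find_sites(reverse_30))
```

For this primer pair the scan reports an EcoRI site in the first primer and a Bgl II site in the second, in agreement with the vector note for pQE12.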

The pQE vector was digested with the restriction enzymes indicated. The amplified 
sequences were ligated into the respective pQE vector and inserted in frame with the 
sequence encoding for the RBS. The ligation mixture was then used to transform the E. coli 
strain M15/pREP4 (Qiagen, Inc.) by electroporation. M15/pREP4 contains multiple copies 
of the plasmid pREP4, which expresses the lacI repressor and also confers kanamycin 
resistance (Kan^r). Transformants were identified by their ability to grow on LB plates, and 
ampicillin/kanamycin resistant colonies were selected. Plasmid DNA was isolated and 
confirmed by restriction analysis. Clones containing the desired constructs were grown 
overnight (O/N) in liquid culture in LB media supplemented with both Amp (100 µg/ml) 
and Kan (25 µg/ml). The O/N culture was used to inoculate a large culture at a ratio of 
1:100 to 1:250. The cells were grown to an optical density at 600 nm (O.D.600) of between 0.4 and 
0.6. IPTG ("Isopropyl-β-D-thiogalactopyranoside") was then added to a final 
concentration of 1 mM. IPTG induces by inactivating the lacI repressor, clearing the P/O 
and leading to increased gene expression. Cells were grown an extra 3 to 4 hours. Cells were 
then harvested by centrifugation. 
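
The inoculation ratio and the IPTG concentration given above translate into working volumes by simple dilution arithmetic. The sketch below is illustrative only: the 1 liter culture volume and the 1 M IPTG stock are assumptions chosen for the example and are not stated in this Example.

```python
def inoculum_volume_ml(culture_ml: float, ratio: int) -> float:
    """Volume of overnight culture needed for a 1:ratio inoculation."""
    return culture_ml / ratio


def stock_volume_ml(culture_ml: float, final_mM: float, stock_mM: float) -> float:
    """Stock volume from C1*V1 = C2*V2, ignoring the small volume added."""
    return culture_ml * final_mM / stock_mM


culture_ml = 1000.0      # assumed culture volume (not given in the text)
iptg_stock_mM = 1000.0   # assumed 1 M IPTG stock (not given in the text)

print(inoculum_volume_ml(culture_ml, 100))   # inoculum at 1:100
print(inoculum_volume_ml(culture_ml, 250))   # inoculum at 1:250
print(stock_volume_ml(culture_ml, 1.0, iptg_stock_mM))  # IPTG for 1 mM final
```

Under these assumptions the overnight inoculum falls between 4 ml and 10 ml, and 1 ml of stock brings the culture to 1 mM IPTG.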

The primer sequences set out above may also be employed to isolate the target gene 
from the deposited material by hybridization techniques described above. 

Example 2 

Isolation of a Selected Clone From the Deposited Genomic Clones 

A clone is isolated directly by screening the deposited material using the 
oligonucleotide primers set forth in Example 1 for the particular gene desired to be 
isolated. The specific oligonucleotides are synthesized using an Applied Biosystems 
DNA synthesizer. The oligonucleotides are labeled with 32P-γ-ATP using T4 
polynucleotide kinase and purified according to a standard protocol (Maniatis et al., 
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY, 
1982). The deposited clones in the pBluescript vectors may be employed to transform 
bacterial hosts which are then plated on 1.5% agar plates to the density of 20,000- 
50,000 pfu/150 mm plate. These plates are screened using Nylon membranes according 
to the standard screening protocol (Stratagene, 1993). Specifically, the Nylon 
membrane with denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM 
NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, sonicated salmon sperm 
DNA; and 6 x SSC, 0.1% SDS. After one hour of prehybridization, the membrane is 
hybridized with hybridization buffer (6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 µg/ml 
denatured, sonicated salmon sperm DNA) with 1 x 10^6 cpm/ml 32P-probe overnight at 
42°C. The membrane is washed at 45-50°C with washing buffer (6 x SSC, 0.1% SDS) 
for 20-30 minutes, dried, and exposed to Kodak X-ray film overnight. Positive clones are 
isolated and purified by secondary and tertiary screening. The purified clone is 
sequenced to verify its identity to the primer sequence. 

Once the clone is isolated, the two oligonucleotide primers corresponding to the 
gene of interest are used to amplify the gene from the deposited material. A polymerase 
chain reaction is carried out in 25 µl of reaction mixture with 0.5 µg of the DNA of the 
gene of interest. The reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM 
each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq 
polymerase. Thirty-five cycles of PCR (denaturation at 94°C for 1 min; annealing at 
55 °C for 1 min; elongation at 72 °C for 1 min) are performed with the Perkin-Elmer 
Cetus automated thermal cycler. The amplified product is analyzed by agarose gel 
electrophoresis and the DNA band with expected molecular weight is excised and 
purified. The PCR product is verified to be the gene of interest by subcloning and 
sequencing the DNA product. The ends of the newly purified genes are nucleotide 
sequenced to identify full length sequences. Complete sequencing of full length genes is 
then performed by Exonuclease III digestion or primer walking. 
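
The cycling conditions and reagent concentrations stated above can be written out as a small data structure, which makes the total programmed hold time and the absolute amount of each dNTP easy to check. This is a sketch for illustration; the class and function names are hypothetical, and ramp times between temperatures (which depend on the instrument) are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    minutes: int


# Cycle as described in Example 2.
CYCLE = (
    Step("denaturation", 94, 1),
    Step("annealing", 55, 1),
    Step("elongation", 72, 1),
)
N_CYCLES = 35


def total_hold_minutes(cycle, n_cycles):
    """Sum of programmed hold times only; ramping is instrument dependent."""
    return n_cycles * sum(step.minutes for step in cycle)


def pmol_in_reaction(conc_uM, volume_ul):
    """Amount in pmol, since 1 µM x 1 µl = 1 pmol."""
    return conc_uM * volume_ul


print(total_hold_minutes(CYCLE, N_CYCLES))  # minutes of programmed holds
print(pmol_in_reaction(20, 25))             # pmol of each dNTP in 25 µl
```

With the stated values this gives 105 minutes of hold time and 500 pmol of each dNTP per 25 µl reaction, alongside 25 pmol of each primer.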

Example 3 
Screening for Galactosidase Activity 

Screening for α-galactosidase protein activity may be carried out as 
follows: 

Substrate plates were provided by a standard plating procedure. Dilute XL1-Blue 
MRF' E. coli host (Stratagene Cloning Systems, La Jolla, CA) to O.D.600 = 1.0 
with NZY media. In 15 ml tubes, inoculate 200 µl diluted host cells with phage. Mix 
gently and incubate tubes at 37 °C for 15 min. Add approximately 3.5 ml LB top 
agarose (0.7%) containing 1 mM IPTG to each tube and pour onto the NZY plate surface. 
Allow to cool and incubate at 37 °C overnight. The assay plates contain the 
substrate p-nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) in 100 mM NaCl, 100 
mM potassium phosphate, 1% (w/v) agarose. The plaques are overlaid with 
nitrocellulose and incubated at 4 °C for 30 minutes, whereupon the nitrocellulose is 
removed and overlaid onto the substrate plates. The substrate plates are then incubated 
at 70 °C for 20 minutes. 

Example 4 

Screening of Clones for Mannanase Activity 

A solid phase screening assay was utilized as a primary screening method to test 
clones for β-mannanase activity. 

A culture solution of the Y1090 E. coli host strain (Stratagene Cloning Systems, 
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from 
the Thermotoga maritima lambda gt11 library was diluted in SM (phage dilution buffer): 5 
x 10^7 pfu/µl diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml 
tubes at 37 °C for 15 minutes. 
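
The two-step phage dilution described above can be verified with a short calculation. The following sketch is illustrative, with hypothetical function and variable names; it uses only the titer, dilution factors and plated volume given in this Example.

```python
def serial_dilution(titer_pfu_per_ul: float, factors) -> float:
    """Apply successive 1:factor dilutions to a starting titer."""
    for factor in factors:
        titer_pfu_per_ul = titer_pfu_per_ul / factor
    return titer_pfu_per_ul


stock_titer = 5e7                      # pfu/µl, amplified library
diluted = serial_dilution(stock_titer, [1000, 100])
pfu_per_plate = diluted * 8            # 8 µl plated per 200 µl host cells

print(diluted)        # pfu/µl after 1:1000 then 1:100
print(pfu_per_plate)  # total pfu applied per plate
```

The result is 5 x 10^2 pfu/µl after dilution, so that 8 µl delivers about 4,000 pfu per plate.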

Approximately 4 ml of molten, LB top agarose (0.7%) at approximately 52 °C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37 °C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were 
marked with a needle to keep their orientation and the nylon membranes were then 
removed and stored at 4 °C. 

An Azo-galactomannan overlay was applied to the LB plates containing the 
lambda plaques. The overlay contains 1% agarose, 50 mM potassium-phosphate buffer 
pH 7, 0.4% Azocarob-galactomannan (Megazyme, Australia). The plates were 
incubated at 72 °C. The Azocarob-galactomannan treated plates were observed after 4 
hours then returned to incubation overnight. Putative positives were identified by 
clearing zones on the Azocarob-galactomannan plates. Two positive clones were 
observed. 

The nylon membranes referred to above, which correspond to the positive clones, 
were retrieved, oriented over the plate and the portions matching the locations of the 
clearing zones for positive clones were cut out. Phage was eluted from the membrane 
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer) 
and 25 µl CHCl3. 

Example 5 

Screening of Clones for Mannosidase Activity 

A solid phase screening assay was utilized as a primary screening method to test 
clones for β-mannosidase activity. 

A culture solution of the Y1090 E. coli host strain (Stratagene Cloning Systems, 
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from 
the AEPII1a lambda gt11 library was diluted in SM (phage dilution buffer): 5 x 10^7 pfu/µl 
diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml 
tubes at 37 °C for 15 minutes. 

Approximately 4 ml of molten, LB top agarose (0.7%) at approximately 52 °C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37 °C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were 
marked with a needle to keep their orientation and the nylon membranes were then 
removed and stored at 4 °C. 

A p-nitrophenyl-β-D-mannopyranoside overlay was applied to the LB plates 
containing the lambda plaques. The overlay contains 1% agarose, 50 mM potassium- 
phosphate buffer pH 7, 0.4% p-nitrophenyl-β-D-mannopyranoside (Megazyme, 
Australia). The plates were incubated at 72 °C. The p-nitrophenyl-β-D-mannopyranoside 
treated plates were observed after 4 hours then returned to incubation 
overnight. Putative positives were identified by clearing zones on the 
p-nitrophenyl-β-D-mannopyranoside plates. Two positive clones were observed. 

The nylon membranes referred to above, which correspond to the positive clones, 
were retrieved, oriented over the plate and the portions matching the locations of the 
clearing zones for positive clones were cut out. Phage was eluted from the membrane 
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer) 
and 25 µl CHCl3. 

Example 6 
Screening for Pullulanase Activity 

Screening for pullulanase protein activity may be carried out as 
follows: 

Substrate plates were provided by a standard plating procedure. Host cells are 
diluted to O.D.600 = 1.0 with NZY or appropriate media. In 15 ml tubes, inoculate 200 
µl diluted host cells with phage. Mix gently and incubate tubes at 37 °C for 15 min. 
Approximately 3.5 ml LB top agarose (0.7%) is added to each tube and the mixture 
is plated, allowed to cool, and incubated at 37°C for about 28 hours. Overlays of 4.5 
ml of the following substrate are poured: 

100 ml total volume 

0.5 g    Red Pullulan (Megazyme, Australia) 
1.0 g    Agarose 
5 ml     Buffer (Tris-HCl pH 7.2 @ 75 °C) 
2 ml     5 M NaCl 
5 ml     CaCl2 (100 mM) 
85 ml    dH2O 



Plates are cooled at room temperature, and then incubated at 75°C for 2 hours. 
Positives are observed as showing substrate degradation. 
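
The final concentrations in the 100 ml substrate overlay follow directly from the recipe above. The sketch below is an illustrative calculation with hypothetical function names. It covers only the components whose stock strength is stated; the concentration of the Tris-HCl buffer stock is not given and is therefore left out.

```python
TOTAL_ML = 100.0     # total overlay volume from the recipe
OVERLAY_ML = 4.5     # volume poured per plate


def percent_w_v(grams: float, total_ml: float = TOTAL_ML) -> float:
    """Weight/volume percent: grams per 100 ml."""
    return 100.0 * grams / total_ml


def final_mM(stock_mM: float, added_ml: float, total_ml: float = TOTAL_ML) -> float:
    """Final concentration after diluting a stock into the total volume."""
    return stock_mM * added_ml / total_ml


print(percent_w_v(0.5))             # Red Pullulan, % (w/v)
print(percent_w_v(1.0))             # agarose, % (w/v)
print(final_mM(5000.0, 2.0))        # NaCl from 2 ml of 5 M stock, in mM
print(final_mM(100.0, 5.0))         # CaCl2 from 5 ml of 100 mM stock, in mM
print(int(TOTAL_ML // OVERLAY_ML))  # whole 4.5 ml overlays per batch
```

The overlay therefore contains 0.5% (w/v) Red Pullulan, 1% (w/v) agarose, 100 mM NaCl and 5 mM CaCl2, and one batch is enough for 22 overlays of 4.5 ml.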

Example 7 
Screening for Endoglucanase Activity 

Screening for endoglucanase protein activity may be carried out as 
follows: 

1. The gene library is plated onto 6 LB/GelRite/0.1% CMC/NZY agar plates 
(~4,800 plaque forming units/plate) in E. coli host with LB agarose as top agarose. The 
plates are incubated at 37°C overnight. 

WO 98/24799    PCT/US97/22623 

2. Plates are chilled at 4°C for one hour. 

3. The plates are overlaid with Duralon membranes (Stratagene) at room 
temperature for one hour and the membranes are oriented and lifted off the plates and 
stored at 4°C. 

4. The top agarose layer is removed and plates are incubated at 37 °C for ~3 
hours. 

5. The plate surface is rinsed with NaCl. 

6. The plate is stained with 0.1% Congo Red for 15 minutes. 

7. The plate is destained with 1 M NaCl. 

8. The putative positives identified on plate are isolated from the Duralon 
membrane (positives are identified by clearing zones around clones). The phage is 
eluted from the membrane by incubating in 500 µl SM + 25 µl CHCl3. 

9. Insert DNA is subcloned into any appropriate cloning vector and 
subclones are reassayed for CMCase activity using the following protocol: 

i) Spin 1 ml overnight miniprep of clone at maximum speed for 3 
minutes. 

ii) Decant the supernatant and use it to fill "wells" that have been 
made in an LB/GelRite/0.1% CMC plate. 

iii) Incubate at 37°C for 2 hours. 

iv) Stain with 0.1% Congo Red for 15 minutes. 

v) Destain with 1 M NaCl for 15 minutes. 

vi) Identify positives by clearing zone around clone. 

Numerous modifications and variations of the present invention are possible in 
light of the above teachings and, therefore, within the scope of the appended claims, the 
invention may be practiced otherwise than as particularly described. 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide selected from the group consisting of: 

(a) SEQ ID NOS: 1-14 and 57-60; 

(b) SEQ ID NOS: 1-14 and 57-60, wherein T can also be U; 

(c) polynucleotide sequences complementary to SEQ ID NOS: 1-14 and 57-60; 

(d) polynucleotide sequences which encode an amino acid sequence as set 
forth in SEQ ID NOS: 15-28, and 61-64; and 

(e) fragments of (a), (b), (c) or (d) that are at least 15 consecutive bases in 
length and that will selectively hybridize to DNA which encodes a 
polypeptide of SEQ ID NOS: 15-28, and 61-64. 

2. A vector comprising a polynucleotide of claim 1. 

3. A host cell containing the vector of claim 2. 

4. The method of claim 3, wherein the host cell is a eukaryotic cell. 

5. The method of claim 3, wherein the host cell is a prokaryotic cell. 

6. A method for producing a polypeptide comprising: 

(a) culturing the host cells of claim 3; 

(b) expressing from the host cell of claim 3 a polypeptide encoded by said 
polynucleotide; and 

(c) isolating the polypeptide. 

7. An enzyme selected from the group consisting of: 

(a) an enzyme comprising an amino acid sequence set forth in SEQ ID NOS: 
15-28 or 61-64; and 

(b) an enzyme which comprises at least 30 consecutive amino acid residues of 
an enzyme of (a). 

8. An enzyme of which at least a portion is coded for by a polynucleotide of 
claim 1, and which is selected from the group consisting of: 

(a) an enzyme comprising an amino acid sequence which is at least 70% 
identical to an amino acid sequence selected from the group of amino 
acid sequences set forth in SEQ ID NOS: 15-28 or 61-64; and 

(b) an enzyme which comprises at least 30 amino acid residues to the 
enzyme of (a). 

9. A method for generating glucose from soluble cell oligosaccharides comprising 
contacting a sample containing oligosaccharides with an effective amount of an 
enzyme selected from the group consisting of an enzyme having the amino acid 
sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64 such that glucose is 
produced. 

10. The method of claim 9, wherein the sample is selected from the group consisting 
of dairy products, fruit juices, detergents, textiles, guar gum, animal feed, plant 
biomass and waste products. 

11. The method of claim 9, wherein the oligosaccharide is selected from the group 
consisting of maltose, cellobiose, lactose, sucrose, raffinose, stachyose, 
verbascose, cellulose, starch, amylose, glycogen, disaccharides, polysaccharides 
and pullulan. 

M11TL GLYCOSIDASE 
COMPLETE GENE SEQUENCE - 9/95 

[Figure 1a: nucleotide sequence with deduced amino acid sequence; illegible in the source scan] 



[Figure 1b (Continued): nucleotide sequence with deduced amino acid sequence; illegible in the source scan] 



[Figure 2: nucleotide sequence with deduced amino acid sequence; illegible in the source scan] 

[Figure 3 (9/95): nucleotide sequence with deduced amino acid sequence; illegible in the source scan] 



[Figure 4a: complete gene sequence, nucleotide sequence with deduced amino acid sequence; illegible in the source scan] 



[Figure 4b (Continued): nucleotide sequence with deduced amino acid sequence; illegible in the source scan] 



[Figure 5a: nucleotide sequence with deduced amino acid sequence; illegible in the source scan] 



WO 98/24799 PCT7US97/22623 

8/46 



ACT 


A.ip 


rcr 

.Scr 


TGC 
Trp 


I2«) 

*ut 


GAG 
C,\u 


ATA 

lie 


AAC 


AAA 

Ljrs 


O20 

440 


CGT 
Cly 


CAC 
Clu 


CCA 

Cly 


TAC 
Tyr 


mo 

460 


CAA 

Clu 


CTC 
Uu 


ATA 
lie 


AAA 
Lys 


MiO 


CTC 
Uu 


AAC 
Asn 


ATC 

lie 


CCA 
Cly 


1300 
500 


CTC 
Uu 


CTC 
Vil 


TCG 
Trp 


CaG 
Gin 


1560 
320 


ATT 
tie 


AAT 
A_sn 


CCC 
Pro 


TCC 
Scr 


1620 
540 



[Sequence rows for positions 1201-2160 (residues 401-720); most of the text is illegible in the source. Legible fragments:]

1201 ... TAC ... AAA ... GAA ACA ... TAT ...
401 Glu Glu ... Ile Lys Lys ... Glu Thr ...

1261 GCA ACC CTC ATA AAA CCC AAA CTC CCA CAC AAT TTC CTC TCA CAA AAA ...
421 ... Thr Val Ile Lys Pro Lys Leu Pro Glu ...

... ACA CAT TTT CTC CTT CAC CGA CAC

CCA TGA 2166
721 Pro End 722



... 1680
... Trp Thr Pro 560

... TAC GTG CGA TAC 1740
... Tyr Val Gly Tyr 580

... CGC CTC TCT TAC 1800
... Gly Leu Ser Tyr 600

... CTC AGA CTC TCC 1860
... Leu Arg Val Ser 620

... CTC TAC ATC AAA 1920
... Val Tyr Ile Lys 640

... CAC AAA ACA AAA 1980
... His ... Thr ... 660

... AGA CAT CTT CCC 2040
... Arg Asp Leu Ala 680

... AGC CTC GCT CCA 2100
... ... Val Gly Ala 700

... AAG AGA TTC AAA 2160
... ... Arg Phe ... 720



Figure 5b (Continued)




THERMOCOCCUS AEDII12RA GLYCOSIDASE - 18B/G

[Nucleotide and deduced amino acid sequence, positions 1-1320 (residues 1-440), printed 60 nucleotides per row. The sequence text is illegible in the source. Legible fragments:]

... Ala Arg Gly Ile Thr Ile Thr Ile ...

301 ACA CCC ACC GAG CTA ACC CAT ACC TCC AAT ...

Figure 6 






THERMOCOCCUS CHITONOPHAGUS GLYCOSIDASE - 22G
COMPLETE SEQUENCE - 9/95 

[Nucleotide and deduced amino acid sequence, positions 1-1260 (residues 1-420), printed 60 nucleotides per row. The sequence text is illegible in the source.]



Figure 7a




[Sequence rows for positions 1261-1536 (residues 421-512). The codon rows are illegible in the source. Legible amino acid rows:]

... Tyr Asp Val Arg Gly Tyr Leu His Trp ... Leu Thr Asp Asn Tyr Glu Trp

... Arg Met Arg Phe Gly Leu Tyr Glu Val Asn Leu Ile Thr Lys Glu Arg

501 Ser Asn Ile Arg Lys Glu Ile Leu Glu Glu Gly End 512


Figure 7b (Continued)






COMPLETE SEQUENCE - 10/95



[Nucleotide and deduced amino acid sequence, positions 1-1260 (residues 1-420), printed 60 nucleotides per row. The sequence text is illegible in the source. Legible fragment:]

... Tyr Val Thr Glu Asn Gly Ile Ala Asp Ser ...



Figure 8a






[Sequence rows for positions 1261-1500 (residues 421-500) are illegible in the source.]

1501 AAA AAG ATT GAA GAG GAA TTC CTC ACG TGA 1533
501 Lys Lys Ile Glu Glu Glu Leu Leu Arg Gly End 511



Figure 8b (Continued)






Bankia gouldi endoglucanase (37GP1)
c 18 27 36 4S 

5" MC ACA ATA CGT TTA GCO ACG CTC GCO CTC TCC CCA GTC. rrr In 

- *u «-« ^ 2 £ s: i s £ s s? s 

63 72 b: 90 

GAC GCC GAC CCC GGT A 
Aap Ala Asp Gly cly L 

117 125 «S U4 153 

TCC AAC 
Sor Arn . 

171 ieo ias 



117 !26 13S 

AK CGA GCC CTT TAC GCC ATC AAT AAC TO MC OCX Oli i£? 162 

ur ». ^ My .,„.,„ e « ~ "ESS s iS 

171 180 IBS 1QQ 

270 288 297 int 

= = = 5=-S5 = 5 = 5Sa = S22S 

441 *50 453 1SB 

TOG TCC ACC CCC GTC GCT CAG AAT CTC OCT CGC CCC an" r-»» r>ZZ 466 
Trp Trp Thr Glv V.l A la r.„ . T G CT GGC GGC GOT SAA CCC AAT CTG GAC 

^ Gly 41 Gin Afln »*» Glv Gly GlY GIu Pro A*n Leu Aap 



GGC GGC GGC GAA GCG CTG GTT GAA GGA GAC CCC AAT CTP Tir r~r*~ 5 " 

«» «y «. «. u. „ =lu 01y 4 S jS 2 £ 2 « «J « 

549 55a 567 57fi c «r 

657 g^r e aA 

GTT GGC ACC CAC GAC GAT OTA GTG AAA fSAA r»i *~ 693 702 

«. ~ - «. „ v„ 2 k s s: s s s s: s E 



Figure 9a






711 720 72 a 

CTG CAC ACC TAT TTC GAA ACC GCC AAA AAA GCC CGC GCC AAA 'ttt* ?56 

" u ms Th * «- C1 « - *. ss s e s: s 5 E 

765 - 774 7 « 792 B01 



AAA ATC ACC GOT CCC CTG CCC CCT AAT nr w*. Jti 801 *10 

- «. - «, «. „ £ 21 » - « - « « « » s „ 

819 829 B37 fl ,< 

TC TC5 GTA CCC CW GAA CU Gffi -m, ^ _ 846 855 864 

- « « «. «. «. £ 5 » s - - s s - ™ « « 

873 382 B9i 

5=222222=222 = ~=- = 2 

= = i 5 = 5 2 5 5 £ 2 5 5 s 5 = = = 

-^ 8 ^ 590 909 inn a 

■ = 222222S2222222K= , =! 

1035 1044 10 e-» . 
GAA GGT GGC TOG GAT GAC AGC ATC AAC AXG ^ 1071 1080 

1089 1098 H07 

CAT TGC CTC GAG GAA TAT ATfi m~ 1135 1134 

t» «. 2 2 S 2 S 2 S s 2 2 2 £ 



1143 1152 



CAA ATO TGC GTG CGC^t" GTQ AAT^S ATC Ac/S GCC AW * 1188 
«- M. t W ^ ^ yal ^ ^ £ ^ « ATC ^ TAT «e ^ 

1197 1-206 1215 1231 

=222222222222222-2 

1251 1360 1269 127R nn 

AAC ACC GGA ATG TGO GAA ACA CTC CAC CTC TTC ACC rrr ^r- Hl 1296 

1305 131i 1323 1312 njt 

= = 222222222S.22« = = 'S 
2 222222 = s = = s- = 'S s= '2 

Figure 9b (Continued)






Bankia gouldi endoglucanase (37GP1) (continued)

~ 14 " 1422 1431 "<° 1449 l«fl 

ACC CAC ACC CCC ACT GTC GCT ATC GAC CAT TTC CCA CTG GAT GGC CCC TAC CCC 
Thr Hi, Thr Ala V.l Ala II. A,p Aap P*. , ra U» !!£ S E£ 2£ 5 

1461 1476 H«5 1494 1503 1C1 , 

ACC CGC TTA CAC AAC CTG CCO GGG GAG GAA ACC TTC GTA TCT CAC CGA GAC 

Thr Leu Ar* Hxa Aan teu Pro Gly Giu Glu Thr Phe Val Ser S- £J £p 

1521 1530 1539 1548 isS7 , e ,, 

AAC GCC CTO GAA AAA GOT ACA GIG CSC GCC AGC GAC AAT ACG CTA ACA CTC CAC 
A " ^ Glu «y «» V*l Arg Ala s.r Acp Thr Si i£ S 

1575 1584 1593 1602 lfin 

TTC CCC CCT CTG TCC GTT ACT GCA ATA TTG CTC AAG GCC CGG CCC TAA 3'
Leu Pro Pro Leu Ser Val Thr Ala Ile Leu Leu Lys Ala Arg Pro



Figure 9c (Continued)






Complete Gene Sequence

«U lie cy s Val ciu no PIl0 Cly ^ to £ ^ ™ ~; - — - ~- 
Lys Glu Lya K*n Ph . Val Glu Pb « Ala "~ ~ ~ ~ — - ~ 
Lys He Ser Gly Ar, Val cly" ^ £ ™ ~ ™ ~ ~ — — 

Lys Ala Pro Clu Lys Vttl ^ vri A^ A^n ^ s« i£ Gly ^ ^ ^ 
225 234 243 3S3 

v a i v *i asp Ph , s« Phe ^; ;^ ^ ciu ill ~?z '^i ^ ^ ^ ~ 

279 288 297 30S 3is 

--f^^^^f^^J^^^^^CTCCAGAGCcaCTATTO 
Thr Ala Ser Val Val Pro Asp Val Lou Glu Arg Am Leu Gin Ser Asp Tyr Phe 

™^!^f^f^^f!!^ C °f TTrCTC ATC GOV CAT OT 
Val Ala Glu Glu Gly Lys val Tyr Gly Phe Leu Ser Ser Ly^ lie" Ma" His Pr^ 
387 396 405 414 493 a-,-, 

Phe Phe Ala Val Clu Asp cly clu Leu Val All ^ IZ Gl^ T^r Phe Val 

Glu Phe Anp Asp Phe Val Pro Leu Glu Pro Leu Val CIl Glu Asp Pro Aan 

<9S 504 513 52? 53, ,.. 

aca ccc err err ctc gag aaa tac gcg gaa ctc ctc gga atc gaa aac aac 

The Pro Leu l^u Leu Glu Ly S Tyr Ala Clu Leu Col Cly tat Glu A^ A^n Ala 

i4,) 558 50" 1 576 585 soa 

ACA ^ ^ CAC ACA CCC ACT CGA TCC TGC ACC TCG TAC CAT TAC TTC CTT 

Ar* val Pro Lys lliB - nur r£Q , llr cly Tqp ^ ^ r ^ "~ ™ ~ 

Figure 10a






Thermotoga maritima Alpha-galactosidase
Complete Gene Sequence

CAT CTC ACC TOG GAA CAG ACC CTC AAG AAC CTC AAG CTC COG AAC AAT TTC CCG 
Asp L« Xhr Ttp Clu Clu Thr I*u ^ Aan £ ^ Uy* j£ ^ 

TTC GAG GTC TTC CAG ATA GAC CAC OCC TAC GAA £2 CAC ATA CUC TGC CTC 

Ph. Giu Val Ph. Gi „ He ^ ^ £ ^ ««" £ j£ ^ ^ ^ ~ — 

711 720 729 73S tai 

Val Thr Ary Gly Asp Phe Pro 5«r Val Gl~u" Glu Ala" Lys Val lie Ala* ^ 

Asa Gly phe He P^o Gly lie Trp Thr Ala Pro" Phi ^ vll sZ Glu Thr" ^r 

A*p Val Phe Asn Glu His Pro Arp Trp Val Val Lys Glu Asa Gly Glu P^o Lys 
_ £7^ 832 331 500 309 Sift 

Met Ala Tyr Arg Aan Trp A*n Ly» Ly* Il« Tyr Ala L«u Asp Leu S«r Lys Aap 

527 3^6 945 954 963 073 

AAG ATC GGC TAC 

Glu Val uu Asn Trp Leu Phe Asp Lw Phe Ser S«r Uu Acb Lya Me~t Gly Tyr^ 

i~ 981 990 399 1008 1017 1026 

CCA GAA AGA AAA 

Arg Tyr Phe Ly* U« Asp Phe Leu Ph« Ala Gly Ala Val Pro Gly Glu Ary Lyi 

1035 1044 1053 10 " 1071 1080 

«^ ATT GAG ACG ATC AGA AAA 

Lya Aan U. Thr Pro lie Gin Ala Phe Arg Lys Gly He Glu Thr Ila Ar^ Lye 

, 1Q89 1090 1107 1116 1125 1134 

iT^ ^""^ CCC CCA 

Ala val Gly Glu Asp 5c r Phe lie Leu Gly Cya Gly S*x Pro Leu Leu Pro All 

UO Hb2 U61 1170 U79 US8 

GTG GGA TGC CTC GAC COC ATC AGO ATA OGA CCT CAC ACT GCG CCG TTC TOG GGA 

Val Gly Cys Vnl Arp Cly Met: Arg lie Gly Pro Ajp Tlir Alti Pro Phe Trp Gly 

Figure 10b (Continued)






Thermotoga maritima Alpha-galactosidase

«A «X ATA c£°L ^ r ^ ^ ^ ^ ^ ^ ^ 

Glu Hi, 2 x. Clu Asp Asn Cly Ala" ^ ^ ^ ^ ^ ^ ~ — — ~ 

ATA _^ „™ ^ ^ TTC ?OC U £ XAC GAC^ ^ ^ 

Ue l« xr, giu Glu Lys ^ ^ £ ;Z ^ oi: ^" ^ £ ~ 

- ACC^ CCA ^ ^ Arc ^ ^ ^ ^Ug 

^ te ^ Wy VU ^ U ^ «« "e HI Clu Ser Asp Asp Leu Str Leu 
03c ^§ „ T ^£ ^ ^ ^ ^ ^ ore «*£ 

v *i Arg Asp His cly Lys Lys Va l c« ^ ^ ^ ^ ^ ~ ~ ~ ~ 

^ C^S CTT CAA^ ATC ATC^TCG GAG GAT^ AG* TAC^GAG « „»£ 

- CCC^ ^ AAC G^ ATC ^ CAT CI^ AGC ACA^ 

Ser Gly ISrI*uS« Gly VaI Ly , „ # v ~ ^ ~~~ — — --- 

»C CAC^ OAA AAA^ ^ ^ TCC AAA AGA^GTC OIC ^ 

TVr His ^ Glu Lys Glu cly Lys Scr ^ ~ ™ -~ ~ -~ --- --. „_ 

~ ^ ^11? ™ ™ cao 1 ^ «, ^sg ^ 3 . 

Glu Asp Gly Arg Asn Phc Tyr Ph. ^ clu Gl~u cly «u A^ gIu ^ 



Figure 10c (Continued) 






Thermotoga maritima β-mannanase



9 18 27 36 45 54
ATG GGG ATT GGT GGC GAC GAC TCC TGG AGC CCC TCA GTA TCG GCG CAA TTC CTT
Met Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Leu

63 72 81 90 99 108
TTA TTG ATC GTT GAG CTC TCT TTC GTT CTC TTT GCA AGT GAC GAG TTC GTG AAA
Leu Leu Ile Val Glu Leu Ser Phe Val Leu Phe Ala Ser Asp Glu Phe Val Lys

117 126 135 144 153 162
GTG GAA AAC GGA AAA TTC GCT CTG AAC GGA AAA GAA TTC AGA TTC ATT GGA AGC
Val Glu Asn Gly Lys Phe Ala Leu Asn Gly Lys Glu Phe Arg Phe Ile Gly Ser

171 180 189 198 207 216
AAC AAC TAC TAC ATG CAC TAC AAG AGC AAC GGA ATG ATA GAC AGT CTT CTG GAG
Asn Asn Tyr Tyr Met His Tyr Lys Ser Asn Gly Met Ile Asp Ser Val Leu Glu

225 234 243 252 261 270
AGT GCC AGA GAC ATG GGT ATA AAG GTC CTC AGA ATC TGG GGT TTC CTC GAC GGG
Ser Ala Arg Asp Met Gly Ile Lys Val Leu Arg Ile Trp Gly Phe Leu Asp Gly

279 288 297 306 315 324
GAG AGT TAC TGC AGA GAC AAG AAC ACC TAC ATG CAT CCT GAG CCC GGT GTT TTC
Glu Ser Tyr Cys Arg Asp Lys Asn Thr Tyr Met His Pro Glu Pro Gly Val Phe

333 342 351 360 369 378
GGG GTG CCA GAA GGA ATA TCG AAC GCC CAG AGC GGT TTC GAA AGA CTC GAC TAC
Gly Val Pro Glu Gly Ile Ser Asn Ala Gln Ser Gly Phe Glu Arg Leu Asp Tyr

387 396 405 414 423 432
ACA GTT GCG AAA GCG AAA GAA CTC GGT ATA AAA CTT GTC ATT GTT CTT GTG AAC
Thr Val Ala Lys Ala Lys Glu Leu Gly Ile Lys Leu Val Ile Val Leu Val Asn

441 450 459 468 477 486
AAC TGG GAC GAC TTC GGT GGA ATG AAC CAG TAC GTG AGG TGG TTT GGA GGA ACC
Asn Trp Asp Asp Phe Gly Gly Met Asn Gln Tyr Val Arg Trp Phe Gly Gly Thr

495 504 513 522 531 540
CAT CAC GAC GAT TTC TAC AGA GAT GAG AAG ATC AAA GAA GAG TAC AAA AAG TAC
His His Asp Asp Phe Tyr Arg Asp Glu Lys Ile Lys Glu Glu Tyr Lys Lys Tyr

Figure 11a






Thermotoga maritima β-mannanase (continued)



549 558 567 576 585 594
... GTC AAT ... TAC ACG ... CTT CCT TAC AGG CAA
Val Ser Phe Leu Val Asn His Val Asn Thr Tyr Thr Gly Val Pro Tyr Arg Glu

603 612 621 630 639 648
... CTT AAC GAA CCG CCC TGT GAG ACG GAC
Glu Pro Thr Ile Met Ala Trp Glu Leu Ala Asn Glu Pro Arg Cys Glu Thr Asp

657 666 675 684 693 702
AAA TCG GGG AAC ACG CTC GTT GAG TGG GTG AAG GAG ATG AGC TCC TAC ATA AAG
Lys Ser Gly Asn Thr Leu Val Glu Trp Val Lys Glu Met Ser Ser Tyr Ile Lys

711 720 729 738 747 756
... AAC CAC CTC GTG GCT GTG GGG GAC GAA CCA TTC TTC AGC AAC
Ser Leu Asp Pro Asn His Leu Val Ala Val Gly Asp Glu Gly Phe Phe Ser Asn

765 774 783 792 801 810
TAC GAA GGA TTC AAA CCT TAC GGT GGA GAA GCC GAG TGG GCC TAC AAC GGC TCG
Tyr Glu Gly Phe Lys Pro Tyr Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp

819 828 837 846 855 864
TCC GGT GTT GAC TGG AAG AAG CTC CTT TCG ATA GAG ACG GTG GAC TTC GGC ACG
Ser Gly Val Asp Trp Lys Lys Leu Leu Ser Ile Glu Thr Val Asp Phe Gly Thr

873 882 891 900 909 918
TTC CAC CTC TAT CCG TCC CAC TGG GCT GTC ACT CCA GAG AAC TAT GCC GAG TGG
Phe His Leu Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asn Tyr Ala Gln Trp

927 936 945 954 963 972
GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA CCC
Gly Ala Lys Trp Ile Glu Asp His Ile Lys Ile Ala Lys Glu Ile Gly Lys Pro

981 990 999 1008 1017 1026
GTT GTT CTG GAA GAA TAT GGA ATT CCA AAG ACT GCG CCA GTT AAC AGA ACC GCC
Val Val Leu Glu Glu Tyr Gly Ile Pro Lys Ser Ala Pro Val Asn Arg Thr Ala

1035 1044 1053 1062 1071 1080
ATC TAC AGA CTC TGG AAC GAT CTG GTC TAC GAT CTC GGT GGA GAT GGA GCG ATG
Ile Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp Gly Ala Met

Figure 11b (Continued)






1089 109 * 1107 

" = = = »£J - «■£ AM „»s 

- ^ « ^ „. 0ly „. oly ~ ~ - - -- ~ --- ... 

1143 1152 1161 1MA 

» » « « « „ « „ m cro M -° ^ ^ b ™ 

„ c ly Pta «, X1 . - - - - - ~ --- ... ... 

1137 1206 1215 

- U. ~, «. „. „. «.„ _ ~ ~ ~ - - - --- _ ... 

12 51 1260 !269 

- *. _ ». „. ^ to L „ a«p s « i~ ~ - - - -- --- 

1305 1314 " 1323 

™ =CT C=T TO M M ^ „ c ^ ^ m » ^ ^ 

« ~, M. = ly Val n. „ ^ s „ ~ - = -;- ~ ~ --- -.. ... 

13 59 use i 377 

».i «• ^ ^, wl Ph . Slu „„ „„ „ : » ~ - --- _ ._ ... „ 

1413 1422 1431 

- ~ ™ ™ ~ » « «- « « « - „ 

«y - ».„ u. Tte ^ ^ n . p „ - ~ ~ --- _ --- ... _. 

1467 K76 14P5 

~ = ~ ™ ------ «•£ « „»» m 

- «. p.. «u «, *. «. vsl S s ~ -- - -- - - 

«. =1. ^ w VI1 ^ M . M „ ~ ~ ;- ; --- --- ... ... ... 

1575 1584 1593 , rn , 

v. t,. „ B „„ s „ 01y ^ ^ ~ -;- --- ... ... „. „. 

Figure 11c (Continued)






1629 163B 1647 lg cr 

II- Olu Trp X,„ cly Glu val cly ^ ~ - — ~" ~- --- --- --- 

CCC CCA^ ,CC CAC^ GAA ^ ACA CTA^ AGG AAC^ GAA AGA^CTC 
Pro Gly Ly3 Ser ^ Trp Glu Glu - ~ - - ~~ --- --. 

^ ^ T. ^ « ™^ - TAC"S CCA AAC^GTC GAG GGA^CTC 

s« giu cy 9 giu ii. ^ u 0 i tt ^ ~ ~ ~ ~ ~ - - --- --- 

1791 1800 1809 I9ia 1M - 

AAC CCA AGG TTO AGG « TAC GCG GTT CTG AAC CCC GGC t^S AAG ATA^GGC 
Hr. Gly ax. L.u Ar S Pro Tyr Al~ vl! ^ ~ ~ ~ ~ ~ "~ ~ 

«C GAC 1 ^ AAC AAC^ AAC c^oS ACT OCa'S ATC ATC^ACT TTC GGC^GGA 
A*p M.t A, n A*„ XI. xsn val 0x1 III HI ^ "^ "~ ~ ™ "~ ~~ 

1859 1908 1S17 ,, 2fi ,., c 

AAA GAG TAC AGA ACA TTC CAT GTA AGA ATT GAG TTC GAC AGA ACA GCG GGG^GTG 
Lys Glu Ty* Ar g Ar 5 Ph. Hi. Val ^ n e ^ III A^p A^ Gl'y V^l" 



1953 1962 1971 1980 



^^^^^^f^^"' - AGO TAC GAT GGA CCG^TT 

Glu U.U His u. cly Val Val Gly Asp £ £ ^ ~ ~ p ~ ~ ~ 
2007 2Qi 6 2025 

TTC ATC GAT AAT GTG AGA CTT TAT AAA AGA ACA GGA GGT ATG TGA 3'
Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met



Figure 11d (Continued)






9 18 27 



« - >™ «. «. „ „ 01y - - ~ -- _ ... ... ... 

~ «, u. ta3 tes ~ - - 

117 135 

V, «. Ph . _ n . Ly . - = q - = --- _ _. _. ... 

a. *,„ to ^ „. „- - - - - ~ - - _ _ ... 

225 234 243 

U. -» ^ n „ ~ ~ - - ~ ~ --- ... ... 

279 288 2 97 

» ^ „ «, Vll „„ fc - - - - _ _ ... „. ... ... 

333 342 tci 

«» « ,„ tl . „„ L „ s „ a - - - - ~ ~ _ __ _ _ 

387 396 405 

«. „. v., „« « ;;; - - = - - _ --- ... 

441 4 ^0 459 

- V, «. V„ ^ „. u _ His £ £ = ~ ™ ~ - - ~ --- 

495 504 513 
ATA GTG GCA AGG GAG AAG GCC CTC ACA AAP rxr III , 531 540 
.„ MC A.TC GGC TGG CTC TCC CAG 

Ile vai ai - ^ Giu ^--ai. Leu ^ Asp ^ u~: ;r ; 

~<P Arg lie Gly Trp Val Ser Gin 

Figure 12a
Ala



AEPII 1a β-mannosidase (63GB1) (continued)

549 558 567 576 5 85 

!H ™ ^ ^ TAT ^ ^ TAC ATC GCC CAT GCG CTC GGA 

Arg Thr Val Val Glu Phe Ala Lys Tyr mI Ma 'ryl ^ ^ ™ ~ ^ ~ 

603 612 621 630 639 

GAC CTC GTG CAC ACA TCC AGC ACC TTC AAC GAA CCT ATG GTA GTT GTG GAG SJ 

Asp Leu v«l Asp Thr Trp Sar Thr Phe Asn Pro Met vll Val Val Glu 

657 666 675 «84 693 , n , 

™ flf ^ TO OT 000 000 GTC ATO CCC GAG <£l 

Gly Tyr Leu Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met ten Pro Glu 

711 720 729 738 747 75fi 

GCG AAG CTG GCG ATC CTC AAC ATG ATA AAC CCC CAC GCC TTG GCA TAT AAG ATC 

Ala Lys Leu Ala He Leu Asn Met He Asn Ala His Ala Leu Ale lyl Met 

765 774 783 752 801 810 

ATA AAG AGG TTC GAC ACC AAG AAG GCC GAT GAG GAT AGC AAG TCC CCT GCG GAC 

lie Lys Arg Ph. Asp Thr Lys Lys Ala Asp Glu Asp Sec Lys Ser Pro All 

819 828 *37 846 855 864 

GTT GGC ATA ATT TAC AAC AAC ATC GGT GTT GCC TAC CCT AAA GAC CCT AAC GAT 

Val Gly lie lie Tyr Asn Asn lie Gly Val Ala Tyr Pro Lye Asp Pro Asn A^p 

873 S82 891 900 909 91B 

CCC AAG GAC GTT AAA GCA GCC GAA AAC GAC AAC TAC TTC CAC AGC GGA CTG TTC 

Pro Lys Asp Val Lys Ala Ala Glu Asn Asp Asn Tyr Phe His Ser Gly hll Phe 

927 «6 945 954 963 972 

TTT GAT GCC ATC CAC AAG GGT AAG CTC AAC ATA GAG TTC GAC GCC GAA AAC TTT 

Phe Asp Ala lie His Lys Gly Lys Leu Asn He Glu Phe Asp Gly Glu As"n Phe 

981 990 "9 1008 1Q17 1026 

GTA AAA GTT AGA CAC CTA AAA GGC AAT GAC TGG ATA GGC CTC AAC TAC TAC ACC 

Val Lys Val Arg His Leu Lys Gly Asn Asp Trp He Gly Leu Asn Tyr Tyr Thr 

1035 1044 1062 1071 1080 

CGC GAG GTT GTT AGA TAT TCG GAG CCC AAG TTC CCA AGT ATA CCC CTC ATA TCC 

Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser lie Pro Lqu III III 

Figure 12b (Continued) 
(continued)

1089 10 $a nm 

- * o» « , „ „ ^ oly ^ - - ~ --- ... _. 

1 1 J 1 



« _cc= «. « «£ „ M - i<t ^ ^ 

- «, « ,„ „ s „ n . - - - - - - _ ... ... 

1137 laoe i 31 f 

- ... v., „ u . ^ - --; - - - _ _ _ ... _. 



125i 1260 n « 

« - «. « = ^ = „ ^ „ c „- T4c at - ^ ^ 

~ «. s „ „ - - - - - _ _ _ ... 

1305 1314 1377 

~ ^ a. «. «. „. H . ^ - -;; - ~ - _ --- _. ... 

13 59 1368 1377 

==:f?;f?— — -« ^ „ „»« 

«» - u. Wir Aap x«n Jyr du Trp « £ ^ = = ~ - - - 

~ ™™ ± 2 ~« - - s: „ ^ M 

- * v., „ L . u a . _ = - - ~ - - --- ... ... 

1467 1476 I4fls 

«. n. * *, w 01 „ = - » - - - --- ... ... 

J-sai 153 o 153s 

Glu Phe Leu Lys Gly Glu Glu Zy^ 



Figure 12c (Continued)
45 54



OC1/4V Endoglucanase (33GP1)

«.« v> „ u «, Hi a Ph . „ a „, Vil 1<U ~ ;-- ~ ; -- --- ... ... 

™ 2 ... ™ » *? - 101 - •» « ~ « « « « „ £ 

- l. u n . s « „ „, „. sly -;; - - - - --- --. ... _ 

117 126 135 m 

-~ -.^^f ^^---^caScM tac aa" 

se, Met cau can s.r *i «. Glu s '~ £ ^ £ ^ £ ~ ~ ™ 

171 180 183 tqfl 

« ec v.i « y Lya Giy m ^ Ile ~ ^ - - "-- ~ ~ --- --- 

225 234 243 

!^ « ^ ^ ^ ^ ATT CW «, «xx ttT TTT CM ATO IS AAO AAA S 
Gly A!. Trp OXy Val Aro H. «U„ £ Cl'u ^ ^ clu ^ £ ~ ~ ~ 



279 288 297 30S 315 



« A TTT CAT TCT an ACS ATT CCC ATA ACA « ^ OCA CAT £ TCC CAA £J 
.Cly Phe A-p ser V.l Ar, „. Pro n . ~ £ ~ ~ ~ ~ ~ --- 

AGG AAT TTC CTC GAA AGA GTT AAC CAT GTT GTC GAT 
P» P„ Tyr A S p xi. A.p Ax g A,„ £ £ ^ ~ ~ ~ ~ "~ ~ 

387 39 « 405 4ii 

AGG CCT CTT GAG AAT AAT TTA ACA GTA ATC ATC ACG CAC CAT TTT GAA GAA 

Ars Ala ^ Glu ^ n ^ ™* ™ «: a:; ;^ hI; ^ £ ; lu - 

441 450 459 ififl 

CTC _TAT CAA GAA CCG CAT AAA TAC GGC GAT GTT TTG GTG GAA ATT TGG AGA CAC 

L9U *** Cln clu pro Asp ^ ^ £ vi; £ ai; x": ^ ^ ~ 

xi. ax- Lyo Phe Ph8 LyB Asp ^ Pr ; ~ ; ™ - -- ; --- --- --- 

Figure 13a
---^ ^ ^ ^ ^ ^ ^ - - OCX OT ^ ^ ^ ^

Glu Pro Al. Gin Leu Thr Al. Glu Ly .' £ ~ " a ~ ~ - - 

603 612 «21 fi , ft 

CTC AAA GTT ATC AGG GAG AGO AAT CCA ACC CGG A^ m A „ £ ^ ^ £ 



Leu , ya v*l Xle a,. Glu S .r A,» Pro ^ ^ ~ ~ ~ ~ ~ ~ ~ 

557 666 675 t*A 

!!? ~ 111 *« CTA £ „ A GTC £c CAC AAA £ 

A,n Trp Ala His Ty* Ser Ala Val Ar 3 Se r ~ ~ ~ ~ ~ ~ ~ - 

711 720 729 via 

-?f ™ ™ « ™ « ™ "cuetr^^wwSc^ ™ 

n. n. v«i s « Ph . „ ia ^ ^ clu ~ ™ — ~ -~ --- -~ 

765 77 * 783 7Q-3 

CAA TGG GTT AAT CCC ATC CCA CCT GTT AGO GTT AAG TG3 AAT GGC GAG GAA J£ 

«« txp Val >, n ^ X1 . Pro ~; ~ -~ ™ — — --- --- --- -~ 

813 828 837 fl4fi 

~ -IT ™ ^ ^ ^ ^ ™ ^ ™ ™ GAC TO GCA AAG J£ 

Glu II. A3 n Gin XI. Ara S« hI," £ £ ^ Val ™ ~ ~ ~ ~ — 



873 882 B91 9 00 909 



918 
ATG 



a™ a.„ vai Pro ii. p h . ^ ;i; -~ ~ ™ ~ ~ ;~ ~ — — --- 

927 936 «S 954 a*, 

~. ™ ^ ?!T !?? *f ? ™ ™ 010 AGA A « « 22 caa ttt £j[ 

Asp Ser Ars vai Ly3Tr7Th - *~ vi: ^ i;: ~i ~i --- ;~ --- 

999 inriR 

H! ^ !.! !!! ?! !!! ^ m ^ «» « « ata 1 ?^ ^ aga 1 ^ 

Ph. Ser Tyr Ala Tyr Trp Glu P he Cy, Zl Gly ^ ^ ™ ^ 

1035 10*4 1053 iq£2 

TCT CAA AAC TOG ATC GAA CCA TTG CCA ACA OCT CT* G„ GGC^ACA GGC AAA^GAG 
Ser Gin A S „ Trp U, Glu Pro L.u III ^ Vai V^l Gly ^ ^ ™ ^ 



TAA 3 * 



Figure 13b (Continued)
(6GP3)



5 ' *!? !!! *f* ^ «° « ™ - 4 s aac «« ^ ^ ^ Jl 

Mec Asp Leu Thx Ly , val cly x - -~ - - - ~ ~ --- --- ... 

xsp vn Ala , ya ^ ^ Ph . n> Glu na - ~- --- --- --- _ 

L 1 ! ^ « «| ™ ATT TTC TAC GAA AAA CCA GAC aS TCT CCC a£ 

a. Lau Gin ax y v,i « u «« a. ill £ ^ - ~ ~ - ~ - 

171 1S0 189 1QR 

ATC ™ TTC CCA CAG CCA ACO TCO AAC AAG GTG ATC GAG GCT TTT CTG ACC AAT 
n. Phe P ha Ala 01„ Ala ax 8 s« Z~n £ ^ £ ^ ~ ~ ~ ~ ~ 
" 5 234 "3 252 

Pro Val Asp Thr Ly 3 Lya Lys ^ ~ ~ ~" ~ ™ "~ — — — 
279 288 297 

^ff^^^^^^occ.rcar^^c at* 

H. Pro Val Ser Ar* V.l Glu Ly8 xii £ ^ £ ^ ^ ^ ™ ™ ~ 

333 342 35! lcn 

TAC « ACA ATC CTC „T TCT OAA CTG AAA GAA GAA CAC £ AGA AAA 2 

^ -VI ax, n. w ^ u Ser Glu ~ ~ ~ ~- — -- ; ~ ~- -~ 

387 396 405 in 

~ ~ ~ *If ™ ^ TAC **» "» « ^ «c Arc S atg gag S 
V.1 «» >a n. «. Glu Gly ^ - - - ~ ~ ~- --- --- --- 

441 450 459 Jfip 

^^^^^1^^^^^^-cc.atatSccacaoaac 

Leu Asp Aap Tyr Tyr Tyr A 3 p oiy IZ ~al'y III vll Z'r III III l y l 

495 504 513 

-!! T. !!! ^ !! c ccc ^ ^ ^ ^ cta a^ ^ err crc ^ 

Thr Ila Ph . Arg V.X Trp Ser v.'i ^ ^ ~ C~ ^" ~ ™ ~ 

Figure 14a
Thermotoga maritima pullulanase (6GP3) (continued)
549 sg7 

--^^^...^-^^^^^ 

^ a,„ G ly Glu ^ ^ GlH ^ ^ w - ~- ~ ~ --- --- ... ... 

A 3 n G ly v a a Trp „„ U4 yal Val G - u - ~ ~ ~ --- --- ... _ ... 

TVr Gin Leu Glu Asn Tyjr Cly L y3 I1b £ £ £ ^ ~ £ ~ ~ ~ 

* la val Ala *- Aon gi * «« *« *• vii ™ L'; L"u' £ ^ ^ 

765 774 783 7Q5 

Pro Glu Gl y Trp Glu to ^ ^ ^ - "~ ~ ~ ~ ™ ™ --- 

819 S28 837 aj , 

- !- -!! ™I ™ ^ ™ « *« ACA £ « GAA I£ TCC OGC, £ 

II- II. Tyr Glu Ila Hi. XI* ^ ^ ~ ~ ~ ~ ~ ~ ~ — ~ 

873 882 891 onn 

A3n Ly, Gly Lau Tyr t*u Gly ^ £ ^ ~ ~ ~ ~ — — ™ 
927 gjg 345 

« y v«i ^ Thr Gly L . u ser m ; ~ ~ --- --- --- --- «- --- --- 

981 990 qaq 1nnB 

ATA ----- ^ TAC ACA GCC GAC CAA CT. GAT^AAA GAT ^ 

no I.* Pro Pha Phe ^ Phe ^ ^ G1 ; - ~- - ; - ~ --- --- ... 

AAG TAC TAC AAC TGG^GGT T»r- ^,^5" 1062 1071 1080 
1 „. ™ TAC CCT CTG TTC ATG GTT CCG GAG GGC AGA 

«*. Tvr Tyr Aan Tr p Gly Tyx ^ p ~ ~ ~ ~ ~ ~ ~ ~ - 

Figure 14b (Continued) 
(6GP3) (continued)

1085 1098 no-? 

» ,„ ly , „ P „ ai ; » - - ~ - ; - --- _ _ 

1143 1152 1 1 /?i 

« « « « _„c ^ « „ ^ Kr „™ m ^ ^ ^ 

V*l Ly, Ala Lau Hia Lys His Gly H. ciy Val" n" C" T - 

eiy Val Ha Mee ^ „ ec m ^ ^ 

1137 1206 i2i« 

~ ~ ~ " S « « ™ ™ « a. « l S „ 
«. - .v, «, n. «, „. « ~ ~ = - - - - --- _ 

1251 12 60 12fi0 

~ «, a. «. Qll ~ -- ~ - - -. 



1305 1314 

V! ». M. S. t „„ ^ ,„ M „ „.„ ^ » ~ - ~ ~ _ _. _ 



«« *» V.! „„ 01u ^ Hi , n . „ - - - = ~ _ _. _. 

1413 1422 ldn 

He A ^ Ly . ^ M . t ^ Glu Qi - ~ ~ --- ... ___ ... 

s ™»z * ~ « - - » « « ATC ^m. 

Hen. Leu ryr Gly « lu Pro ^ ~ "~ ™ ~ ~ ~- --- ~ --- 
«A AAO^ACC GAT CTC^ « ACA^CAC « CCA^ W ^ ^ ^ 

oiy io. - Asp va i Ala Gly ^ His ~ - -~ - ~- --- 

1S75 1584 15 ,, 

GAC CCA ATA AGO GGT TCC GTG TTC AAC CCG ARr ^ 1620 

— ... _ ^ CCG CTC AAG CGA TTC CTC ATG GGA 

Aap Ala He A „ Gly s „ Val phe ^ -- ~- -~ --- -- 

^er vol Lys Gly Phe Val Met Gly 

Figure 14c (Continued)
1629 1638 1647 1656

~-- - - - - « - atc aaa AGG ^ err cca^c m ^ 

^ ^ ^ ^ n " ^ ^ ™ ™ ^ l: ' n ^ 

«C CCA^ « « X 2J ACT ^ nT „ ™ CAA «™ ATA AAC^TAC 
A.P Cly ^ a. L y , Ser Ph . Ala L"u £ £ ^ ~ ~ ~ ~ - 

CCA CCO 1 ^ cac OXC 1 ^ «C ACA^CTC T=C CAc'IS AAC TAc'S CCC CCC^ 

A1. M. cy S Hi. a, p a,„ L"u" s :~ ~ ~ - - - ; - 

CCT CAT^ AAX AAG^GAA «. ACC^ CAA GAA^ AAA ^ ^ AAA^CTG 
Ala Asp Lya Ly8 , y , Glu ^ ^ - ~ ~- --- --- -.. 

C !3 ^ ^ ~* - - ccr or,'™ ttc m ™ ^ ggg^cag 

Ala oiy Ala «. Lau ^ u ^ ser cla aly ~ ^ — — ~ --- 

1899 1S08 ig 17 

~ ™ !!? ™ ^ ^ AAT TTC AAC GAC AAC TCC TAC AAC GCC COT ATC^TCG 
A.P Ph. cy. Ar 3 Th, Thr Asn £ ^ ^ ^ ^ j£ ^ ~ ~ ~ 

1953 1962 19?1 

-I- ™ « " ™ «I I* ^ AAA OT CAC^ ATA OAC^ TTC AAt"" 

ne a,„ cly Pho ASP rrr ciu £ £ £ c^ £ n"; ^ IZ ~ ~ 

2007 2016 2025 , rtn . 

^^^^^^^---CAC^C^^^^ 
His Lys Cly L«u I le r. y , u,u Ax ff Ly8 ^ ~ ~ ~ ~ - ~ ~ 

OCT CAA 2 ^ ATC AAA^ CA C ^ „ ^ ^ ^ ^ ^ 

Ma Giu G iu ii. Ly . Ly . „ ia ^ Glu Phe ^ ;~ --; -~ ~- --- 

2115 2124 2133 3 . 

CCO TTC AT. CTT AAA CAC CAC OCA COT OCT CAT CCC ^ ^ ^ GTG^OTG 

Ala Pha Met Leu Ly* A 3 p His Ala Gly Gly Asn Pro l'r^ t~~~ T 

y oiy Asp Pro Trp Lys Aap H e Val Val 

Figure 14d (Continued)
2178 2187

«• W CI, „„ Ua „„ t „ „„ ~ ~ ~ ~ ~ - ; - --- -•- ... 

2323 2232 23di 

- v.x ^ „ Wn „; ~ - - - »- _ --- ™ ... ... 

= ~ ~ ~ « « - «• « - ~ s « « „ 3. 

oi, „. „. ^ „ p „ ^ - ~ ----- - - --- 



Figure 14e (Continued)






1 



CTT TTA TTG ATC GTT GAG CTC TCT TTC 3TT CTC TTT r« *r-r „ 
Leu Leu Leu H e v «i n r A AGT GAC GAG TTC 

He Val Glu Leu Ser Phe v.l Leu P he Ala Ser A S p Glu Phe 

GTG AAA GTG GAA AAC GGA AAA TTC GCT CTG AAC GGA m r», 
Val Lys Val Gin a ^ *** GAA TTC AGA TTC 

y S Va! «« Asn aly L ys Phe Ala Leu Asn Giy Lys Glu Phe ^ phe 

I" gT T C r ^ « ^ ™ "C AAC GGA ATG ATA GAC 

«y Ser As, Asn ^ ^ Met Hifl ^ Lys -C 

z z: z z a? r atg ggt ata gtc « - - « 

Glu Ser Ala Arg Asp Met Giy „. L ys Val Leu Arg He Trp 

GGT „C CTC GAC GGG GAG AGT TAC TGC AGA GAC AAG AAC ACC TAC ATG CAT 
«y Phe Leu Asp Giy 0Iu Ser ^ Cys ^ ^ ^ ^ ^ ^ ™ ™ 

z z :z :z r c r gtg cca gaa gga *» - « c flG AGC 

Giy val ?ne Giy val Pro Glu Giy u e Ser Asn Ala Gin Ser 

z i c l°: l ctc r tac aca gtt gcg ™ Gc ° ^ - - - «* 

V Glu Arg Leu Asp Tyr Thr Val Ala Lys Ala Lys Glu Leu Giy He 

£ - z r 7i crr gtg ^ tcg gac - ™ « - - - 

Z Z Zt r T ^ GGA GGA ACC GAT ™ ™ AGA GAT 

Tyr Val Ar g Trp Phe Giy Giy Thr His His Asp Asp Phe ^ Arg Asp 

£ £ 2 r r ™ ^ tac gtc tcc m ctc - - « 

Lys Glu Glu Tyr Lys Lys Tyr Val Ser Phe Leu Val Asn His 

"i Z 7.1 I" r GGA ^ ^ TAC AGG SAA GAG CCC ACC ATC - «* 

Asn Thr Tyr Thr Giy Val Pr0 Tyr Ar g Glu Glu Pro TUr Ile Met Ala 

TGG GAG CTT GCA AAC GAA CCG CGC TGT GAG ACG GAC AAA TCG GGC »n *n 
^ Glu Leu Ala Asn Glu Pro Arg Cys Glu Thr Asn T I 

« g <-ys oil Thr Asp Lys Ser Giy Asn Thr 
CTC GTT GAG TGG GTG AAG GAG ATG AGC TCC TAC ATA AAG AGT CTG GAT CCC
Leu Val Glu Trp Val Lys Glu Met Ser Ser Tyr Ile Lys Ser Leu Asp Pro

Asn His Leu Val Ala Val Gly Asp Glu Gly Phe Phe ...

TTC AAA CCT TAC GGT GGA GAA GCC GAG TGG GCC TAC AAC GGC TGG TCC GGT
Phe Lys Pro Tyr Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp Ser Gly

GTT GAC TGG AAG AAG CTC CTT TCG ATA GAG ACG GTG GAC TTC GGC ACG TTC
Val Asp Trp Lys Lys Leu Leu Ser Ile Glu Thr Val Asp Phe Gly Thr Phe

CAC CTC TAT CCG TCC CAC TGG GGT GTC AGT CCA GAG AAC TAT GCC CAG TGG
His Leu Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asn Tyr Ala Gln Trp

GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA
Gly Ala Lys Trp Ile Glu Asp His Ile Lys Ile Ala Lys Glu Ile Gly Lys

CCC GTT GTT CTG GAA GAA TAT GGA ATT CCA AAG AGT GCG CCA GTT AAC AGA
Pro Val Val Leu Glu Glu Tyr Gly Ile Pro Lys Ser Ala Pro Val Asn Arg

ACG GCC TAC AGA CTC TGG AAC GAT CTG GTC TAC GAT CTC GGT GGA GAT
Thr Ala Ile Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp

... GAA GGT ...
Gly Ala Met Phe Trp Met Leu Ala Gly Ile Gly Glu Gly Ser Asp Arg Asp

GAG AGA GGG TAC TAT CCG GAC TAC GAC GGT TTC AGA ATA GTG AAC GAC GAC
Glu Arg Gly Tyr Tyr Pro Asp Tyr Asp Gly Phe Arg Ile Val Asn Asp Asp

... TAC GCG ... TTC AAC ACA GGT
Ser Pro Glu Ala Glu Leu Ile Arg Glu Tyr Ala Lys Leu Phe Asn Thr Gly

GAA GAC ATA AGA GAA GAC ACC TGC TCT TTC ATC CTT CCA AAA GAC GGC ATG 
GAG ATC AAA AAG ACC GTG GAA GTG AGG GCT GGT GTT TTC GAC TAC AGC AAC 



Figure 15b (continued) 
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Glu He Lys Lys Thr val Glu val Arg Ala Gly Val Phe Asp ^ s „ ^ 



IT ^ ^ ^ GAT CTG TTT GAA AAT GAG 

Thr Pne Glu Lys Leu ser val Lyg val Qlu Asp ^ *« «« 

ue o G " Hi S T r r tac gga att tac ggc m gat ctc gac ** « 

Glu Hi S Leu oly Tyr Gly lle ^ Giy Fhe Agp ^ ^ ^ ^ 

ATC CCG GAT GGA GAA CAT GAA ATG TTC CTT GAA GGC CAC TTT CAG GGA AAA 
Ue Pro Asp 01y Giu Hia Glu Met phe Leu phfi oifl £ 

Thr T T ^ ^ ^ ^ ° TG GT ° GAA GCA «» 'AC GTG 

Thr val Lys Asp Ser n e Lys A1 a Lys Val v al Asn Glu Ala Arg Tyr Val 

CTC GCA GAG GAA GTT GAT TTT TCC TCT CCA GAA GAG GTG AAA AAC TGG TGG 

Leu Ala Qlu Glu Val Asp Phe Ser Ser Pro Glu Glu Val Lys Asn Trp Trp 

AAC AGC GGA ACC TGG CAG GCA GAG TTC GGG TCA CCT GAC ATT GAA TGG AAC
Asn Ser Gly Thr Trp Gln Ala Glu Phe Gly Ser Pro Asp Ile Glu Trp Asn

OOP GAG GTG GGA AAT GGA GCA CTG CAG CTG AAC GTG AAA CTG CCC GGA AAG 
Gly Glu Val Gly Asn Gly Ala Leu Gln Leu Asn Val Lys Leu Pro Gly Lys

AGC GAC TGG GAA GAA GTG AGA GTA GCA AGG AAG TTC GAA AGA CTC TCA GAA 
Ser Asp Trp Glu Glu Val Arg Val Ala Arg Lys Phe Glu Arg Leu Ser Glu 

TGT GAG ATC CTC GAG TAC GAC ATC TAC ATT CCA AAC GTC GAG GGA CTC AAG 
Cys Glu Ile Leu Glu Tyr Asp Ile Tyr Ile Pro Asn Val Glu Gly Leu Lys

GGA AGG TTG AGG CCG TAC GCG GTT CTG AAC CCC GGC TGG GTG AAG ATA GGC 
Gly Arg Leu Arg Pro Tyr Ala Val Leu Asn Pro Gly Trp Val Lys Ile Gly

CTC GAC ATG AAC AAC GCG AAC GTG GAA AGT GCG GAG ATC ATC ACT TTC GGC 
Leu Asp Met Asn Asn Ala Asn Val Glu Ser Ala Glu Ile Ile Thr Phe Gly

GGA AAA GAG TAC AGA AGA TTC CAT GTA AGA ATT GAG TTC GAC AGA ACA GCG 
Gly Lys Glu Tyr Arg Arg Phe His Val Arg Ile Glu Phe Asp Arg Thr Ala



Figure 15c (continued)
GGG GTG AAA GAA CTT CAC
Gly Val Lys Glu Leu His 

GGA CCG ATT TTC ATC GAT 
Gly Pro Ile Phe Ile Asp

TGA 1931 
END 



ATA GGA GTT GTC GGT 
Ile Gly Val Val Gly

AAT GTG AGA CTT TAT 
Asn Val Arg Leu Tyr 



GAT CAT CTG AGG TAC GAT 
Asp His Leu Arg Tyr Asp 

AAA AGA ACA GGA GGT ATG 
Lys Arg Thr Gly Gly Met 



Figure 15a (continued)



Figure No. 16 Thermotoga maritima MSB8(6gb4)
1 « AAA AGA ATC GAC CTG AAT GGT TTC TGG AGC GTT AGG GAT AAC GAA GGG AGA TTT TCG

1 Met Lys Arg Ile Asp Leu Asn Gly Phe Trp Ser Val Arg Asp Asn Glu Gly Arg Phe Ser

« h"s " C ^ ™ «° GAA ^ «C AGA GAG TGG ATC 

« Hx. Pro Tyr val Gly Met Asn Glu Asp Leu Phe Lye clu He Glu ^ Ar 3 Glu ™ £ 

121 TAC GAG AGG GAG TTC GAG TTC AAA GAA GAT GTG AAA GAG GGG GAA CGT GTC GAT CTC GT^
41 Tyr Glu Arg Glu Phe Glu Phe Lys Glu Asp Val Lys Glu Gly Glu Arg Val Asp Leu Val

" l ^ 2 T tT GAT ™ - ^ - 30. 

Gly val Asp Thr Leu Ser Asp Val Tyr Leu Asn Gly Val Tyr Leu ^ ^ ^ 

301 GAA GAC ATG TTC ATC GAG TAT CGC TTC GAT GTC ACG AAC GTG TTG AAA GAA AAG AAT CAC
101 Glu Asp Met Phe Ile Glu Tyr Arg Phe Asp Val Thr Asn Val Leu Lys Glu Lys Asn His

m Leu Z r T ^ ^ ^ ^ ° AG CA ° **= TAC GGG 

Leu Lys Val Tyr Ile Lys Ser Pro rl . Arg Val Pro Lys Thr Leu Glu Gln Asn Tyr Gly

l" "l ^ G1 C IT ^ ^ ^ GCC CAG TAT TCG TAC 

Val Leu Gly Gly pro Glu Asp pro ne ^ ^ ^ ^ a ^ ^ ^ ^ 

"I Ty Z T r T ACA AGC GGT ^ ™ ^ ™ TAC CTC GAG 

Gly Trp Asp Trp Gly Ala Arg Ile Val Thr Ser Gly Ile Trp Lys Pro Val Tyr Leu Glu

IT, Val Z r T CGT 00 GAT ^ GCT TAT ^ ™ «* «» GGG - CAT 

Val Tyr Arg Ala Arg Leu Gln Asp Ser Thr Ala Tyr Leu Leu Glu Leu Glu Gly Lys Asp

2 I' Z 71 ^ ^ " C GTA ^ "* ^ ™ ATT GTG GAA GTT TAT 

Ala Leu Val Arg Val Asn Gly Phe Val His Gly Glu Gly Asn Leu Ile Val Glu Val Tyr

»" vll 21 gT T ™ " A GGG GAG m GAA W **= «* G - -G CTC TTC 720 

Asn Gly Glu Lys Ile Gly Glu Phe Pro Val Leu Glu Lys Asn Gly Glu Lys Leu Phe 240

'21 GAT GGA GTG TTC CAC CTG AAA GAT GTG AAA CTA TGG TAT CCG TGG AAr w , 

241 Aso Glv v»i D v, ... G ^ GTG GGG AAA C CG 780 

Asp Gly val Phe „ 1S ^ Lys Asp Val Lys Leu ^ ^ ^ ^ ^ ^ ^ 

SOI TTC ATA TTC GAA ATC AAC GGT GAG AAA GTC TTC - - »„ 

- * - - - - - «. „ « ~ - - « :z z z z z z z 

™: r: = ™i r: ~ z: z z z z z z r ™ ™ - - - - 

Y Met Val Trp qi„ Asp Phe Met T yr Ala Cys Leu 3 80 
1141 GAA TAT CCG GAT CAT CTT CCG TGG TTC AGA AAA CTC GCG 

~ * ~ », « „ rh , ^ ly . - z z z z z z z z 
« = ™ = Z Z z z z z z z z z z z z z z ■= 

- z - = = = - z z z z z z z z z z z z z z 

" ~ z z z z z z z z z z z z r - ™ » - ~ - 

ie Lys Ala Glu ^lu Asp Pro Ser Thr Pro Tyr 460 
1381 TGG CCA TCC AGT CCA TAC GGC GGT GAA AAA GCG aap *~ 

.« - ~ ,„ ,„ ray „, „„ z z z z z z z z z z z z: 
« z z z z z z z z z z z z z ~ - ~ « - « - 

rp Met Asn Tyr Glu Asn Tyr Glu Lys Asp Thr Gly Arg 500

1501 TTC ATC AGC GAG TTT GGA TTT CAG GGT GCT CCC CAT CCA GAC ^

"» 116 S " CI- Gly Phe G1B Gly Ma p „ Z 2 ™ *» ™ GAG TTT TCA 1560 

y *-o H ls Pro Glu Thr lie Glu Phe Phe Ser 520 

1561 AAA CCC GAG GAA AGA GAG ATA TTC CAT CCC CTC «^ 

■» *. - «. «. «, „. n. Z Z Z Z Z Z Z Z Z Z Z ra ~ 

Lys His Asn Lys Gln Val Glu 540
Figure 16b (continued)



1981 CGA GAA GAA GGG AGA AAA GGT ATT CGA AAA GAC T-A C*r „ ^ ^ 00 **= ACT «=C AGA CGG 2040
661 Arg Glu Glu Gly Arg Lys Gly Ile Arg Lys Asp Leu Gln Asn Gly Thr Pro Ser Arg Arg 680



2041 TGT GAG TTT GGT TGA 2055 
681 Cys Glu Phe Gly End 685



Figure 16c (continued)
Figure No. 17 Bankia gouldi (37 3P 4)

i 1 z £ r r r A cta atg m agg ot acg tat ~ a «* *» ™ *» « 

Met Lys Ly s As „ Leu Leu Mee phe Lyg ^ Leu Thr ^ ^ ^ ^ ^ ^ 






61 CTC TCA CTA AGT TCA GTA GCT CAA TCT CCT GTA GAA AAA CAT GGC CGT TTA CAA GTT GAC 120

21 Leu Ser Leu Ser Ser Val Ala Gln Ser Pro Val Glu Lys His Gly Arg Leu Gln Val Asp 40

121 GGA AAC CGC ATT CTT AAT GCG TCT GGA GAA ATT ACG AGC TTA GCT GGT AAC AGC CTC TTT 180

41 Gly Asn Arg Ile Leu Asn Ala Ser Gly Glu Ile Thr Ser Leu Ala Gly Asn Ser Leu Phe 60

181 TGG AGT AAT GCT GGA GAC ACC TCC GAT TTT TAT AAT GCA GAA ACT GTT GAT TTT TTA GCA 240

61 Trp Ser Asn Ala Gly Asp Thr Ser Asp Phe Tyr Asn Ala Glu Thr Val Asp Phe Leu Ala 80

241 GAA AAC TGG AAT AGC TCA CTT ATT AGA ATA GCT ATG GGC GTA AAA GAA AAT TGG GAT GGC 
81 Glu Asn Trp Asn Ser Ser Leu Ile Arg Ile Ala Met Gly Val Lys Glu Asn Trp Asp Gly

301 GCA AAT GGC TAT ATT CAT ACT CCG CAG GAG W GAA GCT AAA ATT AGA AAA GTT ATT GAT „„ 

101 Gly Asn Gly Tyr Ile Asp Ser Pro Gln Glu Gln Glu Ala Lys Ile Arg Lys Val Ile Asp 120

361 GCA GCT ATT GCT AAC GGC ATA TAT CTA ATA ATA GAC TGG CAC ACT CAC GAA GCA GAG TTA 420

Ala Ala Ile Ala Asn Gly Ile Tyr Val Ile Ile Asp Trp His Thr His Glu Ala Glu Leu 140

X« Z T T GCA GAC CTA TAC GGA «* CCC 4 B0 

Tyr Thr Asp Glu Ala Val Asp Phe Phe Thr Arg Met Ala Asp Leu Tyr Gly Asp Thr Pro 160 

481 AAT GTA ATG TAT GAA ATT TAT AAC GAG CCT ATA TAC CAA AGT TGG CCT GTT ATT AAG AAT 540

161 Asn Val Met Tyr Glu Ile Tyr Asn Glu Pro Ile Tyr Gln Ser Trp Pro Val Ile Lys Asn 180

541 TAT GCA GAG CAA GTA ATT GCT GGT ATA CGT TCT AAA GAC CCA GAT AAT TTA ATA ATT GTA 600

Tyr Ala Glu Gln Val Ile Ala Gly Ile Arg Ser Lys Asp Pro Asp Asn Leu Ile Ile Val 200

z z z r r tat tct 00 ™ gtt gat gta gca tca gca gac c - - A «* «* -r sso 

Gly Thr Ser Asn Tyr Ser Gln Gln Val Asp Val Ala Ser Ala Asp Pro Ile Ser Asp Thr 220

HI Z Z T ^ ^ ^ ^ ^ GCA m ^ =« GAT TTA AGA AAT 72 0 

Asn Val Ala Tyr Thr Leu His Phe Tyr Ala Ala Phe Asn Pro His Asp Asn Leu ^ ^ ^

2« Tl Ma G T T ^ "* " *" *" *" "* "* °™ ™ ™ TGG GGT ACA ATT 780 

Ala Gln Thr Ala Leu Asp Asn Asn Val Ala Leu Phe Val Thr Glu Trp Gly Thr Ile 260
y Ser Gly Leu Il a Ser A3n Lya Leu Thr ^ ^

•» - " r z r s - z ~ r - r r — — 

Asn Trp Asp Thr Glu Thr Ser Thr Gly ^ ^ 

"« zzzzzzzzzzzzzz-- m ~~- - 

Cys u e Ala Ala Mej . ^ ^ ^ ^ ^ jj( 

» zzztzzzzzzzzzzzr--— - 

yr A3n Phe Gin Asp L yg n e Gin Gly Ala 380 

- - - " s r ™: z z z z z r - - c - ~ « - « - - 

iyr ox y Ser Ala Asn Glv Asn c«>- * 

-uy Asn ser Thr Asn Pro He n e 40 o 

1201 TTA AGA GGC GAA Afir rrr * *~ 

- - - «, «. z z z z z z z z z z r r - ~ 

my wu Asp Tyr Asn Asn Gly 420 

12S1 TAC CTA TTA ACT ATT GAA GGT GAT TAT TCC iit 

« - - - ™„ «, - - - - » « ™ zzzzz z: 
* z z z z z z z z r r* - ™ » « - - 

«y «-g ». l„ ,„ m „ Phc Cly 01ii my Lm ^ v ^ ^ jm 

« = = " s = £ : - = - - - = z z = » - « « 

- = = = 5 = = = = z - - z z z z z z z z z z: 

Figure 17b (continued)
1621 ACT ATT ATA AGA AAT TGC GTG TTT TCT rr* r« *

1581 GCT TTT ATT GAT TTA AAA GGA GCC TAT GOT TTT GTA TAC AGA „r .~ 

«• - ». ... „, „, M , m „„ z z ~ r :i ^ - 

Asp Phe Leu Asp Arg Gly Thr Gly Phe Asn Thr 600

« = = = = = = = = = = = = — - = « 

" " r: " z z z z ? r "* ot ~ ~ ™ » « « - - -° 

Asp p„ „. ,„ , sp cly TM nu ^ ^ v ^ ^ ^ ^ ^ 

^ ™ZZ22ZZZZZZZZZZZZ™ ~ 

2101 GAA GTT AAT GCT ACT GAT GCA GAT GGA ACT ATT GAT AAT ~TA AAA „ 
- «. ~ ... A, * A.p Ma A.p 01, » „. „ Z « - « » « « - 

« - 2 " ~ ^ " 2 1" " ra Iat "* - °° c ~ ~ « ~ « - ». 

AW II. ..„ S .r Thr ,„ „, r ^ Ttj My Hl< ^ ^ ^ ^ ^ ^ 
">1 ACA OAT CAA CTT AAT OGT CTT ACA OAA OGA ACT TAT ACC rr> ... „ 

» - - a,„ „, nt Er , hr ^ ™ £ s z z ™ 
« ~ ~ " 7 ~ ~ ™ ™ - « - ™ - «. ~ » » 

P uiy Ala Se. Thr Glu Thr Gin Phe Thr Leu Thr Val lie Thr Gl- n c 

Ile Thr Glu Gln Ser Pro 780

Ser Glu Asn Cys Asp Phe Asn Thr Pro Ser Ser Thr Gly Leu Glu Asp Phe Asp Ile Lys

2401 AAG TTT TCT AAC GTT TTT GAG TTA GGA TCT GGC GGA CCA TCT TTA AGT AAT TTA AAA ACA 2460

Figure 17<V( continued) 



801 Lys Phe Ser Asn Val Phe Glu Leu Gly Ser Gly Gly Pro Ser Leu Ser Asn Leu Lys Thr 820

2«1 TTT ACT ATT AAT TGG AAT TCr r»» , 

„„ AAT TCG CAA TAC AAT CGG TTA TAT r»» . 

■* * - «- «. - ~ «. *, «. _ ~ ~ s ™ ™ z z r: » 

"21 AAC GGT GTA CCT GAT TAT TAT ATA AAT TTA AAA CCA A« *~ 

He Pro Asn Phe Aep Gly Aap ^ ^ flao 

GTA ACA TCA GAT AAC £VT but -™. „

Asn Gln Ile Ser Lys 920

"61 ATTACT W(:ATTCIAGTWr 
- - - - - ^ _ n . Asn phe Lys L ^ ^ - - « « = JJC £ ACT a= 

AU Glu Asp 01u Lya Leu Ala Leu v ^ ^ o 



Figure 17a (continued)
Figure No. 18 Pyrococcus furiosus VC1(7EG1)



leader sequence: amino acids 1-24 



9 18 27 36



ATG AGC AAG AAA AAG TTC GTC ATC GTA TCT ATC TTA ACA ATC CTT TTA GTA 
Met Ser Lys Lys Lys Phe Val Ile Val Ser Ile Leu Thr Ile Leu Leu Val



63 72 81 90 99

GCA ATA TAT TTT GTA GAA AAG TAT CAT ACC TCT GAG GAC AAG TCA ACT TCA AAT 
Ala Ile Tyr Phe Val Glu Lys Tyr His Thr Ser Glu Asp Lys Ser Thr Ser Asn



117 126 135 144 153



tL c ZT T cca ccc ^ ACA ACA OT Tcc ACT Acc ™ ™ ctc - g *~ 

Thr Ser Ser Thr Pro Pro Gln Thr Thr Leu Ser Thr Thr Lys Val Leu Lys Ile

171 189 198 

AGA TAC CCT GAT GAC GGT GAG TGG CCA GGA GCT CCT ATT GAT AAG GAT GGT GAT
Arg Tyr Pro Asp Asp Gly Glu Trp Pro Gly Ala Pro Ile Asp Lys Asp Gly Asp

225 234 243 252 261

*** CCA GAA ^ TAC A " GAA ATA AAC CTA TGG AAC ATT CTT AAT GCT ACT 
Gly Asn Pro Glu Phe Tyr Ile Glu Ile Asn Leu Trp Asn Ile Leu Asn Ala Thr

270 279 288 297 306 315 324

^ GCT GA ° ATG ACG TAC AAT TTA ACC AGC GGC GTC CTT CAC TAC GTC CAA 

Gly Phe Ala Glu Met Thr Tyr Asn Leu Thr Ser Gly Val Leu His Tyr Val Gln

333 342 351 360 369 378

CAA CTT GAC AAC ATT GTC TTG AGG GAT AGA ACT AAT TGG GTG CAT GGA TAC CCC 
Gln Leu Asp Asn Ile Val Leu Arg Asp Arg Ser Asn Trp Val His Gly Tyr Pro

387 396 405 414 423 432 

GAA ATA TTC TAT GGA AAC AAG CCA TGG AAT GCA AAC TAC GCA ACT GAT GGC CCA 

Glu Ile Phe Tyr Gly Asn Lys Pro Trp Asn Ala Asn Tyr Ala Thr Asp Gly Pro



441 450 459 468



ATA CCA TTA CCC AGT AAA GTT TCA AAC CTA ACA GAC TTC TAT CTA ACA ATC TCC 
Ile Pro Leu Pro Ser Lys Val Ser Asn Leu Thr Asp Phe Tyr Leu Thr Ile Ser
540

TAT AAA CTT GAG CCC AAG AAC GGC CTG CCA ATT AAC TTC GCA ATA GAA TCC TGG
Tyr Lys Leu Glu Pro Lys Asn Gly Leu Pro Ile Asn Phe Ala Ile Glu Ser Trp



549 558 567 576



TTA ACG AGA GAA GCT TGG AGA ACA ACA GGA ATT AAC AGC GAT rt ^ 

Leu Thr Arg Glu Ala Trp Arg Thr Thr Gly Ile Asn Ser Asp Glu Gln Glu Val



603 621 



ATG ATA TGG ATT TAC TAT GAC GGA TTA CAA CCG GCT rr, 648 

« »• «p ~ ^ ^ „ p ^ 01n ~ « « ~ - « « « 



ATT GTA GTC CCA ATA ATA GTT AAC GGA ACA CCA GTA AAT n 

»• ~ « a . I1= w A „ Cly „ ~ » « « - - « « 



711 720 729



TGG AAG GCA AAC ATT GGT TGG GAG TAT GTT GCA TTT *o a n ^ 

Lys Ala Asn «. 01y Trp Glu ^ ^ Aia £ l r r cca atc 

tC /urg iie Lys Thr prQ 



765 774 783 792



AAA GAG GGA ACA GTG ACA ATT rra -ran ^* .801 8l0 

CCA TAC GGA GCA T"T nfn 

L ys Glu oiy Thr Val Thr X1 . Pro ^ Gly ^ £ ™ f °" G ° A GCC ^ 

/ iy Ala Phe ile Ser Val Ala Ala Asn 



819 828 837 



ATT TCA AGC TTA CCA AAT TAC ACA GAA CTT TAC TTA GAG „n 

- •« Ssr Leu pr0 Asn ^ ^ ^ ^ - - -C GAG ATT GGA 



873 882 891 900

ACT GAG TTT GGA ACG CCA AGC ACT ACC TCC CCr r*r ™ 

~ «. - «, ». „ s „ Ite „ z z z z z z z 

927 936 945 954

AAC ATA ACA CTA ACT CCT CTA GAT AGA CCT CTT ATT TCC TAA
Asn Ile Thr Leu Thr Pro Leu Asp Arg Pro Leu Ile Ser



Figure 18b (continued) 